Text File | 1996-04-26 | 287KB | 6,747 lines
===========================================================================
===========================================================================
============================ ============================
============================ ============================
============================ PARSE-O-MATIC ============================
============================ ============================
============================ ============================
===========================================================================
===========================================================================
Copyright (C) 1986, 1996 by Pinnacle Software (Montreal)
+------------------------ MULTI-PLATFORM SUPPORT -------------------------+
| |
| |
| Runs under MS-DOS, Windows (including Win95) and OS/2 |
| |
| |
+------- HERE ARE A FEW OF THE THINGS PARSE-O-MATIC CAN DO FOR YOU -------+
| |
| |
| Importing Exporting Automated Editing |
| Text Extraction Data Conversion Table Lookup |
| Retabulation Info Weeding Selective Copying |
| Binary-File to Text Report Reformatting Wide-Text Folding |
| Auto-Batch Creation Comm-log Trimming Tab Replacement |
| Character Filtering Column Switching DBF Interpretation |
| De-uppercasing Name Properization And much more! |
| |
| |
+---- INPUT AND OUTPUT METHODS CURRENTLY SUPPORTED BY PARSE-O-MATIC ------+
| |
| |
| Input: Text (any format), Binary, DBF (DBase), Fixed-Record-Length, |
| Variable-Record-Length, EBCDIC |
| |
| Output: Text (e.g. flat, comma-delimited, paginated, hex), Binary, |
| Fixed-Record-Length, Variable-Record-Length, EBCDIC, |
| Generic output devices (e.g. COM1: or LPT2:) |
| |
| |
+-------------------------------------------------------------------------+
+------------------------ WHO USES PARSE-O-MATIC? ------------------------+
| |
| |
| Our corporate customers include: |
| |
| Boeing, CompUSA, Dresdner Bank, Eaton, Eddy, First Federal Bank, |
| Georgia Gulf, Harris Semiconductor, Home Box Office (HBO), Hughes, |
| McCain Foods, McDonald's, Nestle, Nike, Pacific Gas and Electric, |
| Philip Morris, Pitney Bowes, Prentice Hall, Procter & Gamble, |
| Royal Bank, Royal Caribbean, Sun Life, Visa International |
| |
| |
+----------------------- UNSOLICITED TESTIMONIALS ------------------------+
| |
| |
| "Parse-O-Matic is absolutely great. I use it when I collect data |
| from the McDonald's restaurants in Switzerland. POM has paid for |
| itself so many times ..." -- Chris Friedli |
| |
| |
| "Parse-O-Matic is a wonderful time saver .... Each report that I |
| can convert from our ... accounting system saves our company about |
| 500 man hours per year." -- R. Brooker |
| |
| |
| "In 30 years of working with computers, this is by far the easiest |
| way I have found to extract data from files. I was very surprised |
| that the program just took a few seconds to chew its way through |
| 1MB of data. You ought to mention it's FAST." -- Koenraad Rutgers |
| |
+-------------------------------------------------------------------------+
===========================================================================
AN OVERVIEW OF THIS MANUAL
===========================================================================
Introduction . . . . . . . . What is Parse-O-Matic?
Parse-O-Matic Versus Automatic Converters
Why You Need Parse-O-Matic -- An Example
Parse-O-Matic to the Rescue!
How it Works
How to Contact Us
Fundamentals . . . . . . . . The Parse-O-Matic Command
The POM File
Padding for Clarity
A Simple Example
Quick Reference . . . . . . Command Descriptions
Command Formats
Basic Commands . . . . . . . Set, If
Output Commands . . . . . . OFile, Out, OutEnd, OutHdg, OutPage, PageLen
Input Commands . . . . . . . The Get Command
Variable Length Records
Delimiter-Terminated Data
Handling Long Strings
The GetText Command
The ReadNext Command
End of File Conditions
Optional Comparisons
Ignoring Null Lines
Saving the Previous Line
Input Filters . . . . . . . Minlen, Ignore, Accept
Flow Control Commands . . . Begin, Else, End, Again, Done, NextFile,
Halt, Prologue, Epilogue
Variable Modifiers . . . . . The Trim Command
The Pad Command
The Change Command
The CvtCase Command
Control Settings
The Proper Command
The Insert Command
The Append Command
The MapFile Command
What is a Map File?
Sample Map Files
Map File Format
Search Order
Case Matching
Reverse Mapping
Irreversible Mapping
Memory Limitations
An Example of Remapping
The Remap Command
Remap Versus Change
Using Remap
Free-Form Commands . . . . . What are Free-Form Commands?
The Parse Command
Decapsulators
Sample Application
The Occurrence Number
Finding the Last Occurrence
Unsuccessful Searches
The Control Setting
The Plain Decapsulator
The Null Decapsulator
Null Decapsulators Versus Exclusion
Overlapping Decapsulators
Parsing Empty Fields
Additional Examples
The Peel Command
The Control Setting
Parsing Empty Fields
The Left-Peeling Method
Positional Commands . . . . General Discussion
What are Positional Commands?
Why Use Positional Commands?
A Cautionary Note
The SetLen Command
The Delete Command
The Copy Command
The Extract Command
The FindPosn Command
The Plain String Find
Using a Single Decapsulator
The Encapsulated String Find
Control Settings
Insoluble Searches
Null Decapsulators
Finding The Last Word
Who Needs This?
Date Commands . . . . . . . General Discussion
The POMDATE.CFG File
Date Formats
The Today Command
The Date Command
The MonthNum Command
The ZeroDate Command
Calculation Commands . . . . Calc, CalcReal, CalcBits
Input Preprocessors . . . . Split, Chop
Lookup Commands . . . . . . The LookUp Command
Search Method
Limitations
Null Lines and Comments
Multiple Columns
LookUp Versus Remap
The LookFile Command
The LookCols Command
The LookSpec Command
Data Converters . . . . . . The MakeData Command
Creating Binary Data
Converting Dates
Practical Considerations
The MakeText Command
Converting Binary Data
Converting Dates
Practical Considerations
Miscellaneous Commands . . . The Erase Command
The GetEnv Command
Disappearing Environment Variables
Examples
The Log Command
The MsgWait Command
Standard Behaviour
Setting a Time-Out Delay
Color Cues
Key Stacking
Exceptions
A Word of Caution
The Pause Command
The Sound Command
The LISTEN Utility
Changing the Error Message Sound
The Trace Command
Terms . . . . . . . . . . . Values
Delimiters
Illegal Characters
Using Comparators
Predefined Data Types
Techniques . . . . . . . . . Uninitialized and Persistent Variables
Inline Incrementing and Decrementing
Line Counters
Tracing
Logging
Quiet Mode
The ShowNum Utility
File Handling . . . . . . . How Parse-O-Matic Searches for a File
How Parse-O-Matic Opens an Output File
Appending to an Output File
Sending Output to a Device
DBF Files
POM and Wildcards
Solving Memory Problems
Operational Planning . . . . Effective Use of Batch Files
Running Parse-O-Matic from Another Program
Unattended Operation
Examples
Running under Windows . . . Compatibility
Setting Up for Windows 95
Installing the ShowNum Utility
Long File Names in Win95
Licensing . . . . . . . . . Trial License
Single-User License
Site and Multi-Copy Licenses
LAN and WAN Licenses
Distribution License
Retail License
===========================================================================
INTRODUCTION
===========================================================================
----------------------
What is Parse-O-Matic?
----------------------
Parse-O-Matic is a programmable file-parser. Simple enough for even a non-
programmer to master, it can help out in countless ways. If you have a
file you want to edit, manipulate, or change around, Parse-O-Matic may be
just the tool you need. Parse-O-Matic can also speed up or automate long
or repetitive editing tasks.
-----------------------------------------
Parse-O-Matic Versus Automatic Converters
-----------------------------------------
Parse-O-Matic is not an "automatic file converter". It will not, for
example, convert WordPerfect files to MS-Word format, or convert Lotus
1-2-3 Spreadsheets DIRECTLY to Excel files -- although it can read reports
from one program and convert them to another format (e.g. comma-delimited),
which can be imported by the other program.
One advantage of this method (as opposed to automatic file conversion) is
that you can create an "intelligent" importing procedure, which can make
decisions and modify data. You could, for example, eliminate certain types
of records, tidy up names, convert case, unify fields, make calculations,
and so on.
----------------------------------------
Why You Need Parse-O-Matic -- An Example
----------------------------------------
There are plenty of programs out there that have valuable data locked away
inside them. How do you get that data OUT of one program and into another
one?
Some programs provide a feature which "exports" a file into some kind of
generic format. Perhaps the most popular of these formats is known as a
"comma-delimited file", which is a text file in which each data field is
separated by a comma. Character strings -- which might themselves contain
commas -- are surrounded by double quotes. So a few lines from a
comma-delimited file might look something like this (an export from a
hypothetical database of people who owe your company money):
"JONES","FRED","1234 GREEN AVENUE", "KANSAS CITY", "MO",293.64
"SMITH","JOHN","2343 OAK STREET","NEW YORK","NY",22.50
"WILLIAMS","JOSEPH","23 GARDEN CRESCENT","TORONTO","ON",16.99
Unfortunately, not all programs export or import data in this format.
Even more frustrating is a program that exports data in a format that is
ALMOST what you need!
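If you know a general-purpose language, the comma-delimited layout shown
above can be read with a few lines of code. The following Python sketch is
an illustration only and is not part of Parse-O-Matic; the names SAMPLE and
read_records are made up for this example.

```python
import csv
import io

# Two records in the comma-delimited layout described above.
SAMPLE = (
    '"SMITH","JOHN","2343 OAK STREET","NEW YORK","NY",22.50\n'
    '"WILLIAMS","JOSEPH","23 GARDEN CRESCENT","TORONTO","ON",16.99\n'
)


def read_records(text):
    """Split comma-delimited text into lists of field strings."""
    # skipinitialspace tolerates a blank between a comma and the next quote.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [row for row in reader if row]


records = read_records(SAMPLE)
for last, first, street, city, region, owed in records:
    print(f"{first} {last} of {city} owes {float(owed):.2f}")
```

Note that the quotes are what allow a field such as a street address to
contain a comma without being split in two.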
If that's the case, you might decide to spend a few hours in a text editor,
modifying the export file so that the other program can understand it. Or
you might write a program to do the editing for you. Both solutions are
time-consuming.
An even more challenging problem arises when a program which has no export
capability does have the ability to "print" reports to a file. You can
write a program to read these files and convert them to something you can
use, but this can be a LOT of work!
----------------------------
Parse-O-Matic to the Rescue!
----------------------------
Parse-O-Matic is a utility that reads a file, interprets the data, and
outputs the result to another file. It can help you "boil down" data to
its essential information. You can also use it to convert NEARLY
compatible import files, or generate printable reports.
------------
How It Works
------------
You need three things:
1) The Parse-O-Matic program
2) A Parse-O-Matic "POM" file (to tell Parse-O-Matic what to do)
3) The input file
The input file might be a report or data file from another program, or text
captured from a communications session. Parse-O-Matic can handle many
types of input. We've provided several sample input files. For example,
the file XMPDAT02.TXT comes from the AccPac accounting software. AccPac is
a great program, but its export capabilities leave something to be desired.
Parse-O-Matic can help!
To see detailed demonstrations of how various files can be parsed, enter
START at the DOS prompt (or run START.BAT from Windows or OS/2), then
select TUTORIAL from the menu.
-----------------
How To Contact Us
-----------------
If you have any questions about Parse-O-Matic, you can write to us at the
following address:
Pinnacle Software, CP386, Mount Royal, QC, Canada H3P 3C6
You can also contact us electronically at the following addresses:
Voice Line: 514-345-9578
Free Files BBS: 514-345-8654
Internet Email: pinnacl@cam.org
World Wide Web: http://www.cam.org/~pinnacl
CompuServe: 70154,1577
===========================================================================
FUNDAMENTALS
===========================================================================
This documentation assumes that you are an experienced computer user. If
you have trouble, you might ask a programmer to help you -- POM file
creation is a little like programming!
-------------------------
The Parse-O-Matic Command
-------------------------
The basic format of the Parse-O-Matic command line is:
POM pom-file input-file output-file
Here is an example, as you would type it at the DOS command line, or as a
command in a batch file: POM POMFILE.POM REPORT.TXT OUTPUT.TXT
For a more formal description of the command line, start up POM by typing
this command at the DOS prompt: POM
------------
The POM File
------------
The POM file is a text file with a .POM extension. The following
conventions are used when interpreting the POM file:
- Null lines and lines starting with a semi-colon (comments) are ignored.
- A POM file may contain up to 750 lines of specifications.
Comment lines do not count in this total.
A POM file does not rely on "loops" (to use the programming term). Each
line or record of the input file is processed by the entire POM file. If
you would like this expressed in terms of programming languages, here is
what POM does:
+-------------------------------------------------------------------------+
| START: If there's nothing left in the input file, go to QUIT. |
| Read a line from the input file |
| Do everything in the POM file |
| Go to START |
| QUIT: Tell the user you're finished! |
+-------------------------------------------------------------------------+
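The same cycle can be sketched in Python. This is only a rough model of
the steps in the box above, not the program's actual code: run_pom and
echo_upper are hypothetical names, and keeping one set of variables alive
from line to line is an assumption made for this sketch.

```python
def run_pom(input_lines, pom_commands):
    """Apply every command, in order, to each input line in turn."""
    output = []
    variables = {}                        # assumed to persist between lines
    for line in input_lines:              # START: stop when input runs out
        variables["$FLINE"] = line        # read a line from the input file
        for command in pom_commands:      # do everything in the POM file
            command(variables, output)
    return output                         # QUIT


def echo_upper(variables, output):
    """A stand-in 'command' that outputs the line in uppercase."""
    output.append(variables["$FLINE"].upper())


print(run_pom(["abc", "def"], [echo_upper]))
```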
The method by which Parse-O-Matic finds the POM file is discussed in the
section "How Parse-O-Matic Searches for a File".
-------------------
Padding for Clarity
-------------------
Spaces and tabs between the words and variables in a POM file line are
generally ignored (except in the case of the "output picture" of the OUT
and OUTEND commands). You can use spaces to make the commands in your POM
files easier to read.
Additionally, in any line in the POM file, the following terms are ignored:
THEN ELSE
(There is a POM command named ELSE, but Parse-O-Matic can tell that this is
not "padding".)
Finally, the equals ("=") character is ignored if it is found in a place
where no comparison is taking place. This will be demonstrated below.
You can use these techniques to make your POM files easier to read. For
example, the IF command can be written in several ways:
Very terse:          IF PRICE = "0.00" BONUS "0.00" "1.00"
Padded with spaces:  IF  PRICE = "0.00"   BONUS   "0.00"   "1.00"
Fully padded:        IF PRICE = "0.00" THEN BONUS = "0.00" ELSE "1.00"
In the last example, the first equals sign ("=") is a "comparator". (For
details about comparators, see the section entitled "Using Comparators".)
The second equals sign is not really required, but it does make the line
easier to understand.
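To see why the three forms are equivalent, here is a simplified Python
sketch that removes the padding from a line. It is an illustration under
stated assumptions, not Parse-O-Matic's real parser: strip_padding is a
hypothetical name, and the sketch drops every bare equals sign, which is
only safe because an omitted comparator defaults to "equals".

```python
import re

PADDING_WORDS = {"THEN", "ELSE"}


def strip_padding(line):
    """Return the tokens of a POM line with the padding removed."""
    # A token is either a quoted string or a run of non-blank characters.
    tokens = re.findall(r'"[^"]*"|\S+', line)
    command, rest = tokens[0], tokens[1:]
    # The first token is kept as-is, so a real ELSE command survives.
    kept = [t for t in rest if t.upper() not in PADDING_WORDS and t != "="]
    return [command] + kept


terse = strip_padding('IF PRICE = "0.00" BONUS "0.00" "1.00"')
padded = strip_padding('IF PRICE = "0.00" THEN BONUS = "0.00" ELSE "1.00"')
print(terse == padded)
```

A line using any other comparator would need that comparator kept in
place, which this sketch does not attempt.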
----------------
A Simple Example
----------------
Let's say you have a text file called NAMES.TXT that looks like this:
WILLIAMS   JACK
SMITH      JOHNNY
JOHNSON    MARY
:          :
Column 1   Column 12
Now let's say you want to switch the columns, so that the first name
appears first. Your first step is to create a file using a text editor.
The file would look like this:
SET last = $FLINE[ 1 10]
SET first = $FLINE[12 17]
PAD first "R" " " "10"
OUTEND |{first} {last}
The first two lines tell Parse-O-Matic which text to extract from each
input line. For the first line of the input file, the variable named
'last' will be given the value "WILLIAMS  ". You will notice there are two
spaces at the end. That is because we take every character from position 1
to position 10 -- which in this case includes two spaces.
The PAD line adds enough spaces on the right side of the variable named
'first' to make sure that it is 10 characters long. The OUTEND command
sends the two variables to the output file.
Save the file with the name TEST.POM and exit your text editor. At the DOS
prompt, enter this command:
POM TEST.POM NAMES.TXT OUTPUT.TXT
This will run the POM file (TEST.POM) on every line of the input file
(NAMES.TXT) and place the output in the file OUTPUT.TXT, which will then
look like this:
JACK       WILLIAMS
JOHNNY     SMITH
MARY       JOHNSON
:          :
Column 1   Column 12
Of course, for such a simple task, it would be easier to switch the columns
yourself, using a text editor. But when you are dealing with large amounts
of data, and want to guard against typing errors, Parse-O-Matic can save
you a lot of time, effort and risk. It also lets you automate editing
operations that you perform frequently.
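For comparison, here is roughly what TEST.POM does to each line, written
as a Python sketch. It is not part of Parse-O-Matic. The names pom_slice
and switch_columns are hypothetical, and the sketch assumes that asking
for columns beyond the end of a line simply returns whatever is there.

```python
def pom_slice(line, start, end):
    """Return columns start..end of line, counting from 1, both included."""
    return line[start - 1:end]


def switch_columns(line):
    """Model the four lines of TEST.POM for one input line."""
    last = pom_slice(line, 1, 10)       # SET last  = $FLINE[ 1 10]
    first = pom_slice(line, 12, 17)     # SET first = $FLINE[12 17]
    first = first.ljust(10, " ")        # PAD first "R" " " "10"
    return f"{first} {last}"            # OUTEND |{first} {last}


for row in ["WILLIAMS   JACK", "SMITH      JOHNNY", "JOHNSON    MARY"]:
    print(switch_columns(row))
```

Because 'first' is padded to 10 characters and followed by one space, the
last name always starts in column 12 of the output.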
===========================================================================
QUICK REFERENCE
===========================================================================
--------------------
Command Descriptions
--------------------
This manual's explanations of the commands are grouped by related
functions, in the following order:
---------------------------------------------------------------------------
BASIC COMMANDS
---------------------------------------------------------------------------
SET Assigns a value to a variable
IF Conditionally assigns a value to a variable
---------------------------------------------------------------------------
OUTPUT COMMANDS
---------------------------------------------------------------------------
OFILE Specify output file or device
OUT Sends text and variables to the output file
OUTEND Like OUT but adds a new line at the end (Carriage Return/Linefeed)
OUTHDG Sets up title lines to appear at the top of a report or each page
OUTPAGE Starts a new page
PAGELEN Sets the page length for a report
---------------------------------------------------------------------------
INPUT COMMANDS
---------------------------------------------------------------------------
GET Manually reads bytes from the input file
GETTEXT Manually reads bytes from the input file and converts them to text
READNEXT Moves to next input line but retains your place in the POM file
---------------------------------------------------------------------------
INPUT FILTERS
---------------------------------------------------------------------------
MINLEN Sets the minimum length required for an input line to be processed
IGNORE Ignores an input line that meets the specified condition
ACCEPT Accepts an input line that meets the specified condition
---------------------------------------------------------------------------
FLOW CONTROL COMMANDS
---------------------------------------------------------------------------
BEGIN Defines the conditions for processing the code block
ELSE Defines the start of code to be processed if the BEGIN fails
END Marks the end of a BEGIN/END or BEGIN/ELSE/END code block
AGAIN Conditionally returns to the corresponding BEGIN command
DONE Reads the next input line and starts at the top of the POM file
NEXTFILE Skips the current input file and proceeds to the next one
HALT Terminates all processing if a given condition exists
PROLOGUE Defines code block to run before input lines are processed
EPILOGUE Defines code block to run after input lines are processed
---------------------------------------------------------------------------
VARIABLE MODIFIERS
---------------------------------------------------------------------------
TRIM Removes a character from the left, right or all of a variable
PAD Centers, or left/right-justifies variable to a specified width
CHANGE Replaces all occurrences of a string in a variable
PROPER Properizes a variable (e.g. "JOHN SMITH" becomes "John Smith")
INSERT Inserts a string on the left or right, or at a "found" position
APPEND Concatenates several variables into one variable
CVTCASE Converts a value to uppercase or lowercase
MAPFILE Reads a file containing data transformations for REMAP
REMAP Transforms sub-strings into other strings
---------------------------------------------------------------------------
FREE-FORM COMMANDS
---------------------------------------------------------------------------
PARSE Obtains a variable found between delimiters in free-form data
PEEL Works like PARSE, but removes the "found" text from the data
---------------------------------------------------------------------------
POSITIONAL COMMANDS
---------------------------------------------------------------------------
SETLEN Sets a variable according to the length of a value
DELETE Removes a range of characters from a variable
COPY Copies a range of characters from a value to a variable
EXTRACT Like COPY, but removes the characters from the source variable
FINDPOSN Finds the starting or ending position of a value in another
---------------------------------------------------------------------------
DATE COMMANDS
---------------------------------------------------------------------------
TODAY Sets a variable to today's date, in a variety of formats
DATE Sets a given year, month and day, in a variety of formats
MONTHNUM Sets the month number of a given month, expressed as text
---------------------------------------------------------------------------
CALCULATION COMMANDS
---------------------------------------------------------------------------
CALC Performs arithmetic functions on integer values
CALCREAL Performs arithmetic functions on decimal values
---------------------------------------------------------------------------
INPUT PREPROCESSORS
---------------------------------------------------------------------------
SPLIT Breaks up a wide text file (more than 255 characters)
CHOP Breaks up a fixed-record-length file
---------------------------------------------------------------------------
LOOKUP COMMANDS
---------------------------------------------------------------------------
LOOKUP Looks up a word in another file and returns a corresponding value
LOOKFILE Specifies the file that the LOOKUP command will use (see also /L)
LOOKCOLS Specifies the format of the look-up file
LOOKSPEC Controls the behaviour of the LOOKUP command
---------------------------------------------------------------------------
DATA CONVERTERS
---------------------------------------------------------------------------
MAKEDATA Converts text into binary format
MAKETEXT Converts binary format into text
---------------------------------------------------------------------------
MISCELLANEOUS COMMANDS
---------------------------------------------------------------------------
ERASE Deletes a file
GETENV Obtains a system environment variable (e.g. PATH)
LOG Adds a line to the processing log
MSGWAIT Controls the behaviour of error messages
PAUSE Delays the specified number of milliseconds
SOUND Makes a noise or sets the noise generated by error messages
TRACE Traces a variable (results saved in the text file POM.TRC)
---------------
Command Formats
---------------
-----------------------------------  ---------------------------------------
COMMAND FORMATS                      EXAMPLE
-----------------------------------  ---------------------------------------
ACCEPT val c val                     ACCEPT $FLINE[1 3] = "YES"
AGAIN [val c val]                    AGAIN linecntr #< "3"
APPEND var val val [val [val]]       APPEND name first last
BEGIN [val c val]                    BEGIN linecntr #< "3"
CALC var num operation num           CALC total total "+" sold
CALCBITS var char operation char     CALCBITS z byte1 "XOR" $80
CALCREAL var num operation num       CALCREAL salary hours "*" rate
CHANGE var val val                   CHANGE date "/" "-"
CHOP from to [,from to] [...]        CHOP 1 250, 251 300
COPY var val from [to]               COPY x $FLINE "3" "5"
CVTCASE var val [ctl]                CVTCASE x $FLINE "LA"
DATE var num num num [ctl]           DATE x "98" "12" "31"
DELETE var from [to]                 DELETE x "3" "5"
DONE [val c val]                     DONE $FLINE = "End Data"
ELSE                                 ELSE
END                                  END
EPILOGUE                             EPILOGUE
ERASE file                           ERASE "C:\MYFILES\OUT.TXT"
EXTRACT var var from [to]            EXTRACT x $FLINE "15" "30"
FINDPOSN var val left [right [ctl]]  FINDPOSN x $FLINE "2*/"
GET var ctl [ctl [ctl]]              GET x #0 "END" "I"
GETENV var val                       GETENV x "COMSPEC"
GETTEXT var ctl [ctl]                GETTEXT date "WORD" "DATE"
HALT val c val val [ctl]             HALT x = y "Item repeated"
IF val c val var val [val]           IF x = "Y" THEN z = "N"
IGNORE val c val                     IGNORE price = "0.00"
INSERT var ctl val                   INSERT price "L" "$"
LOG val c val val [val [val]]        LOG x = y "Item repeated"
LOOKCOLS num num num num             LOOKCOLS "1" "3" "8" "255"
LOOKFILE file                        LOOKFILE "C:\TABLES\DATA.TBL"
LOOKSPEC ctl ctl ctl                 LOOKSPEC "Y" "N" "N"
LOOKUP var val                       LOOKUP phonenum "FRED JONES"
MAKEDATA var val ctl                 MAKEDATA x "255" "BYTE"
MAKETEXT var val ctl                 MAKETEXT z x "BYTE"
MAPFILE file val [ctl]               MAPFILE "XYZ.MPF" "XYZ" "ANYCASE"
MINLEN num [num]                     MINLEN "15" "1"
MONTHNUM var val                     MONTHNUM x "February"
MSGWAIT num                          MSGWAIT "60"
NEXTFILE [val c val]                 NEXTFILE $FLINE = "End File"
OFILE file                           OFILE "C:\MYFILES\OUT.TXT"
OUT [val c val] |pic                 OUT z = "X" |{price}
OUTEND [val c val] |pic              OUTEND z = "X" |{$FLINE}
OUTHDG val                           OUTHDG "LIST OF EMPLOYEES"
OUTPAGE [val c val]                  OUTPAGE partnum <> oldpartnum
PAD var ctl char num                 PAD sernum "L" "0" "10"
PAGELEN num [ctl]                    PAGELEN "66" "N"
PARSE var val left right [ctl]       PARSE x $FLINE "2*(" "3*)" "I"
PAUSE val                            PAUSE "1000"
PEEL var var left right [ctl]        PEEL x $FLINE "2*(" "3*)" "I"
PROLOGUE                             PROLOGUE
PROPER var [ctl [file]]              PROPER custname "I" "XY.PEF"
READNEXT [val c val]                 READNEXT $FLINE[1 5] = "NOTE:"
REMAP var [val]                      REMAP $FLINE "BIN2CODE"
SET var val                          SET name $FLINE[20 26]
SETLEN var val                       SETLEN length custname
SOUND ctl                            SOUND "BUZZ"
SPLIT from to [,from to] [...]       SPLIT 1 250, 251 300
TODAY var [ctl]                      TODAY x "?y/?n/?d"
TRACE var                            TRACE price
TRIM var ctl char                    TRIM price "R" "$"
ZERODATE val val val                 ZERODATE "1753" "12" "31"
-----------------------------------  ---------------------------------------
The following conventions are used in the preceding table:
c        Comparator (if omitted, defaults to "equals" comparison)
char     Variable or literal: must be a single byte or character
ctl      Variable or literal: command control specifications
file     File name (see "How Parse-O-Matic Searches for a File")
from     Variable or literal: a starting character position (see Note #1)
left     Variable or literal: see "Decapsulators"
num      Variable or literal: must contain a number (see Note #1)
pic      Output picture used by OUT and OUTEND
right    Variable or literal: see "Decapsulators"
to       Variable or literal: an ending position (see Note #1)
val      Variable or literal whose value is being read
var      Variable that is being set
[xxx]    Square brackets indicate optional items
Note #1: Tabs, spaces and commas are stripped from numeric values
The commands are explained in detail in the following section. A summary
of the commands and default settings appear, as comments, in the file
QUICKREF.POM. You can copy these comments into your own POM file as
a convenient quick reference.
===========================================================================
BASIC COMMANDS
===========================================================================
---------------
The SET Command
---------------
FORMAT: SET var1 value1
PURPOSE: SET assigns value1 to the variable var1.
ALTERNATIVES: The COPY command.
The usual reason to use the SET command is to set a variable from the input
line (represented by the variable $FLINE) prior to cleaning it up with
TRIM. For example, if the input line looked like this:
JOHN       SMITH     555-1234   322 Westchester Lane    Architect
|          |         |          |                       |
Column 1   Col 12    Col 22     Col 33                  Col 57
then we could extract the last name from the input line with these two POM
commands:
SET NAME = $FLINE[12 21] (Sets the variable from the input line)
TRIM NAME "R" " " (Trims any spaces on the right side)
SET would first set the variable NAME to this value: "SMITH     "
After the TRIM, the variable NAME would have the value: "SMITH"
You will also use SET if you plan to include a portion of a text string in
the output, since the OUT and OUTEND commands do not recognize substrings
after the "|" marker; they only recognize plain text and complete
variables.
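If it helps to think of this in terms of another language, the SET and
TRIM steps above behave roughly like the following Python sketch. It is an
illustration only; pom_slice is a hypothetical helper, and the sample line
is rebuilt here with each field starting in the column shown earlier.

```python
def pom_slice(line, start, end):
    """Return columns start..end of line, counting from 1, both included."""
    return line[start - 1:end]


# Rebuild the sample line: fields start in columns 1, 12, 22, 33 and 57.
line = (
    "JOHN".ljust(11)
    + "SMITH".ljust(10)
    + "555-1234".ljust(11)
    + "322 Westchester Lane".ljust(24)
    + "Architect"
)

name = pom_slice(line, 12, 21)      # SET  NAME = $FLINE[12 21]
trimmed = name.rstrip(" ")          # TRIM NAME "R" " "

print(repr(name))
print(repr(trimmed))
```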
--------------
The IF Command
--------------
FORMAT: IF value1 [comparator] value2 var1 value3 [value4]
PURPOSE: If value1 equals value2, var1 is set to value3. Otherwise,
it is set to value4 (if value4 is missing, nothing is done,
and var1 is not changed).
NOTES: For an explanation of comparators, see "Using Comparators".
In the following explanation, we will demonstrate the
command using only the "literally identical" ("=")
comparator.
ALTERNATIVES: The BEGIN command
Here is an example of the IF command...
SET EARNING = $FLINE[20 23]
IF EARNING = "0.00" THEN BONUS = "0.00" ELSE "1.00"
This obtains the value between columns 20 and 23, then checks if it equals
"0.00". If it does, the variable BONUS is set to "0.00". If not, BONUS is
set to "1.00". The "THEN" and "ELSE" are "padding" and can be omitted.
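The rule for IF, including what happens when value4 is left out, can be
modelled in a few lines of Python. This is a sketch of the behaviour
described above for the "=" comparator only; pom_if is a hypothetical
name, not something provided by Parse-O-Matic.

```python
def pom_if(value1, value2, current, value3, value4=None):
    """Return the new value of var1 under the "=" comparator.

    current is the value var1 holds before the IF command runs.
    """
    if value1 == value2:
        return value3
    # With no value4, the variable is left unchanged.
    return current if value4 is None else value4


bonus = pom_if("0.00", "0.00", "", "0.00", "1.00")
print(bonus)
```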
===========================================================================
OUTPUT COMMANDS
===========================================================================
-----------------
The OFILE Command
-----------------
FORMAT: OFILE value1
PURPOSE: OFILE specifies a new output file or device.
PARAMETERS: value1 is the name of the output file (or device)
ALTERNATIVES: Specify the name on the POM command line.
SEE ALSO: "How Parse-O-Matic Opens an Output File"
"Sending Output to a Device"
When you start up Parse-O-Matic, you can specify the name of the output
file on the command line. For example:
POM MYPOM.POM INPUT.TXT OUTPUT.TXT
In this case, the output file is named OUTPUT.TXT. All data from the
output commands (OUT, OUTEND etc.) are sent to this file. If you omit the
output file name from the POM command, like this:
POM MYPOM.POM INPUT.TXT
then Parse-O-Matic assumes the output file is named POMOUT.TXT (in the
current directory).
Once the name of the output file has been determined, Parse-O-Matic will
use that file until it is changed, using the OFILE command. For example:
OFILE "C:\XYZ.TXT"
This will change the output file to C:\XYZ.TXT. If the file already
exists, it will be renamed with a BAK extension. However, you can tell
Parse-O-Matic to append to the end of an existing file by placing a plus
sign in front of the file name:
OFILE "+C:\XYZ.TXT"
(See "Appending to an Output File" and "POM and Wildcards" for additional
details on appending to output files.)
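The two ways of opening an output file can be pictured with this Python
sketch. It is a rough model, not the program's actual code: open_output is
a hypothetical name, and replacing an older BAK file that is already
present is an assumption, since the text above does not cover that case.

```python
import os


def open_output(spec):
    """Open an output file in the manner described for OFILE."""
    if spec.startswith("+"):
        # A leading plus sign means: append to the existing file.
        return open(spec[1:], "a")
    if os.path.exists(spec):
        # Otherwise keep the old file under a BAK extension.
        # Assumption: an older backup with the same name is replaced.
        backup = os.path.splitext(spec)[0] + ".BAK"
        os.replace(spec, backup)
    return open(spec, "w")
```

For example, open_output("XYZ.TXT") would set aside any existing XYZ.TXT
as XYZ.BAK before writing, while open_output("+XYZ.TXT") would add to the
end of it.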
---------------------------
The OUT and OUTEND Commands
---------------------------
FORMAT: OUT[END] [value1 [comparator] value2] |output-picture
PURPOSE: The OUT command generates output without an end-of-line
(i.e. carriage return and linefeed characters).
The OUTEND command generates output and also adds an
end-of-line.
NOTES: For an explanation of comparators, see "Using Comparators".
In the following explanation, we will demonstrate the
command using only the "literally identical" ("=")
comparator.
When value1 equals value2, a line is sent to the output file, according to
the output picture. Within the output picture, all text is taken literally
(i.e. " is taken to mean literally that -- a quotation mark character).
The only exception to this is variable names, which are identified by the
{ and } characters. For example, a POM file that contained the following
single line:
OUTEND "X" = "X" |{$FLINE}
would simply output every line from the input file (not very useful!).
The "X" = "X" part of the command is the comparison which controls when
output occurs. In the example above, both values being compared are the
same, so output will always occur.
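The way an output picture is filled in, and the way the comparison
controls output, can be modelled with the following Python sketch. It is
an illustration only: render and outend are hypothetical names, only the
"=" comparator is handled, and treating an unknown variable as empty is an
assumption.

```python
import re


def render(picture, variables):
    """Replace each {name} in an output picture with that variable's value."""
    # Everything outside the braces is copied literally, quotes included.
    return re.sub(
        r"\{([^{}]+)\}",
        lambda match: variables.get(match.group(1), ""),
        picture,
    )


def outend(variables, picture, value1="", value2=""):
    """Return the output text plus CR/LF, or '' if the comparison fails."""
    if value1 != value2:
        return ""
    return render(picture, variables) + "\r\n"


variables = {"$FLINE": "Hello, world"}
print(repr(outend(variables, "{$FLINE}", "X", "X")))
```

The OUT command would behave the same way, minus the carriage return and
linefeed at the end.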
You cannot use substrings after the "|" marker. Thus, the following line
is NOT legal: OUTEND $FLINE[1 3] = "IBM" |{$FLINE[1 15]}
The correct way to code this is as follows:
SET CODE = $FLINE[1 15]
OUTEND $FLINE[1 3] = "IBM" |{CODE}
This outputs the first 15 characters of any line that contains the letters
"IBM" in the first three positions.
------------------
The OUTHDG Command
------------------
FORMAT: OUTHDG value1
PURPOSE: OUTHDG is used to place text headers in your output.
ALTERNATIVES: The OUTEND command, used in conjunction with PROLOGUE.
SEE ALSO: "The PageLen Command" and "The OutPage Command"
If you were parsing data to create an employee report, you might use OUTHDG
like this:
SET EMPNUM = $FLINE[ 1 5]
SET NAME = $FLINE[10 28]
SET PHONE = $FLINE[30 45]
OUTHDG "EMPL# NAME PHONE NUMBER"
OUTHDG "----- ------------------- ------------"
OUTEND |{EMPNUM} {NAME} {PHONE}
The value following the OUTHDG command is sent to the output file only
once. That is to say, after an OUTHDG sends a value to the output file,
subsequent encounters with that OUTHDG command are ignored -- unless the
PAGELEN command is used.
To specify a blank line in a heading, use the following command: OUTHDG ""
If your output is bound for a continuous-paper printer (e.g. a dot-matrix
printer with tractor feed), you may find it useful to use one or more blank
lines at the beginning of the header, in order to skip over the perforation
in the paper.
-------------------
The OUTPAGE Command
-------------------
FORMAT: OUTPAGE [value1 [comparator] value2]
PURPOSE: Sends a page eject to the output file (or device).
NOTES: For an explanation of comparators, see "Using Comparators".
SEE ALSO: "The PageLen Command" and "The OutHdg Command"
If the comparison in the OUTPAGE command is true, or if it is omitted,
OUTPAGE will send a "page eject" to the output file or device. (See
"Sending Output to a Device") Some exceptions apply, however. The page
eject is not sent under the following circumstances:
- If the comparison is false (e.g. OUTPAGE "Y" = "N")
- If the page length is set to "0" (the default). Use the PAGELEN command
to specify a different page length.
- If the output file is not yet open. That is to say, if no output has
been sent to the output file via one of the other output commands (e.g.
OUT, OUTEND, OUTHDG), then OUTPAGE will do nothing. (See "How
Parse-O-Matic Opens an Output File")
- If the output is already at the top of a page.
If form feeds are enabled (via the PAGELEN command), OUTPAGE sends a
page eject by sending a Form Feed character (ASCII 12) to the output.
If form feeds are not enabled, OUTPAGE sends blank lines (i.e. linefeeds)
until the requisite number of lines appear on the page.
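The rules above can be summarized as a small decision function. This is a hedged sketch in Python, not Parse-O-Matic's actual implementation; the name page_eject and its parameters are hypothetical, and the use of CR/LF for each blank line is an assumption.

```python
FORM_FEED = "\x0c"   # ASCII 12

def page_eject(lines_on_page, page_len, use_form_feeds=True, output_open=True):
    """Return the characters an OUTPAGE (with a true comparison) would send."""
    if page_len == 0:          # default page length: OUTPAGE is ignored
        return ""
    if not output_open:        # no output has been generated yet
        return ""
    if lines_on_page == 0:     # already at the top of a page
        return ""
    if use_form_feeds:
        return FORM_FEED
    # Assumption: each blank line is written as a CR/LF pair.
    return "\r\n" * (page_len - lines_on_page)
```

For example, with a page length of 55 and form feeds disabled, an eject after 10 lines would pad the page with 45 blank lines.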
OUTPAGE does NOT automatically place OUTHDG text at the top of the page.
OUTHDG text is not "stored"; it is executed in the POM file at the place
it occurs. Here is an example of using OUTPAGE and OUTHDG together:
PAGELEN "55" "Y"
SET partnum = $FLINE[ 1 7]
SET descrip = $FLINE[12 60]
OUTPAGE partnum <> oldpartnum
OUTHDG |PARTNUM DESCRIPTION
OUTHDG |------- -----------
OUTEND |{partnum} {descrip}
SET oldpartnum = partnum
This will generate a new page, complete with headings, when the partnum
variable is different from the oldpartnum variable. Also, because of the
interaction between OUTHDG and PAGELEN, the headings will appear on a new
page if you run out of room on the current page.
-------------------
The PAGELEN Command
-------------------
FORMAT: PAGELEN value1 [value2]
PURPOSE: The PAGELEN command specifies the length of the output page.
PARAMETERS: value1 is the page length
value2 specifies if form feeds should be used
NUMERICS: Tabs, spaces and commas are stripped from value1
DEFAULTS: value2 = "Y"
When text is sent to an output file by OUTHDG and OUTEND, the lines are
counted. The default value for page length is zero, which means that the
output is a single page of infinite length. As such, OUTHDG headings
appear only the first time they are encountered, and OUTPAGE commands
are ignored.
If you specify a page length greater than zero, OUTHDG headings become
re-enabled once the specified number of output lines have been generated,
or after an OUTPAGE command is performed. A typical value is as follows:
PAGELEN "55"
This is an ideal page length for most laser printers. Dot matrix printers
typically use a page length of 66.
Parse-O-Matic inserts a "form feed" (ASCII 12) character between pages.
You can turn this off, however, by specifying the page length this way:
PAGELEN "66" "N"
The "N" specification means, "No, don't use form feeds". Another
acceptable value is "Y", meaning "Yes, use form feeds", but since this is
the default, you do not have to specify it.
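The interaction between line counting and headings can be pictured with a simplified model. The Python sketch below is only an illustration under stated assumptions: the Pager class and report function are hypothetical, and the exact moment at which Parse-O-Matic writes the form feed is assumed (here, as soon as the page is full).

```python
class Pager:
    """Simplified model of PAGELEN, OUTHDG and OUTEND line counting."""

    def __init__(self, page_len=0, form_feeds=True):
        self.page_len = page_len        # 0 means one page of infinite length
        self.form_feeds = form_feeds
        self.lines_on_page = 0
        self.headings_sent = set()
        self.output = []

    def _emit(self, text):
        self.output.append(text)
        self.lines_on_page += 1
        if self.page_len and self.lines_on_page >= self.page_len:
            # Assumption: the page break happens as soon as the page is full.
            if self.form_feeds:
                self.output.append("\f")
            self.lines_on_page = 0
            self.headings_sent.clear()  # headings become re-enabled

    def outhdg(self, key, text):
        if key not in self.headings_sent:
            self.headings_sent.add(key)
            self._emit(text)

    def outend(self, text):
        self._emit(text)

def report(records, page_len):
    pager = Pager(page_len)
    for record in records:              # the POM file runs once per input line
        pager.outhdg("title", "NAME")
        pager.outend(record)
    return pager.output
```

With a page length of zero the heading appears once; with a page length of 4 it reappears after every fourth output line.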
===========================================================================
INPUT COMMANDS
===========================================================================
---------------
The GET Command
---------------
** ADVANCED COMMAND FOR EXPERIENCED USERS **
FORMAT: GET var1 value1 [value2] (Variable length records)
GET var1 value1 "END" [value3] (Delimiter-terminated data)
PURPOSE: Manually reads bytes from the input file.
NOTES: Data is normally read automatically from the input file.
GET is used only when you want precise control of the
reading process. GET works only with files whose format
is defined by the CHOP command. (You can read a file
a byte at a time by using CHOP 1-1 in your POM file.
You can also use CHOP 0 to do all reading manually.)
SEE ALSO: "The Chop Command"
The GET command is especially helpful for:
1) Variable length records
2) Delimiter-terminated data (such as zero-terminated text strings)
These methods are described in detail below.
Variable Length Records
-----------------------
FORMAT: GET var1 value1 [value2]
PURPOSE: Reads a variable-length record.
PARAMETERS: var1 is the variable being set
value1 specifies how many bytes to read, expressed as:
A value in text format (example: GET x "10")
A predefined data type (example: GET x "INTEGER")
A value in byte format (example: GET x len "BYTE")
value2 specifies the data representation used by value1
This can be "TEXT" (the default) or "BYTE"
NUMERICS: Tabs, spaces and commas are stripped from value1, if it is
numeric, and in text format
DEFAULTS: value2 = "TEXT"
SEE ALSO: "Predefined Data Types"
GET can read up to 255 bytes into a variable, as specified by value1.
For example:
GET xyz "10"
This reads 10 bytes from the input file into the xyz variable, and advances
the file pointer. That is to say, after the GET command shown above is
executed, the next data Parse-O-Matic reads will be 10 bytes further along.
If the requested number of bytes is not available in the input file,
Parse-O-Matic terminates with an error message.
In a typical application, variable-length data is preceded in the input
file by a byte that gives its length. You can read the length, then use
it directly, as follows:
GET len "1" "TEXT" <-- Get the length byte
GET xyz len "BYTE" <-- Read in the data
In the first command, the word "TEXT" means that the length specification
(i.e. "1") is plain text. ("TEXT" is the default, so you can omit it.)
In the second command, GET reads len bytes from the input file. The word
"BYTE" means that the length specification is a binary number, not a text
string.
To clarify this, let us assume that the input file contains a length byte
(say hex 4F, which equals 79 in decimal). This is followed by 79 bytes of
data. The first GET command (GET len "1") reads in the length byte (hex 4F
or decimal 79). The second GET command (GET xyz len "BYTE") reads 79 bytes
and places the result in the xyz variable.
The maximum variable length that a single GET command can handle is 255
bytes (i.e. the largest number represented by a single byte).
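For readers who want to see the length-byte technique outside a POM file, here is a minimal Python sketch of the same two-step read. The helper names get_bytes and get_length_prefixed are hypothetical; the error on a short read mirrors the termination described above.

```python
import io

def get_bytes(stream, count):
    """Read exactly count bytes, or fail (GET terminates with an error)."""
    data = stream.read(count)
    if len(data) != count:
        raise EOFError("requested %d bytes, only %d available" % (count, len(data)))
    return data

def get_length_prefixed(stream):
    length = get_bytes(stream, 1)[0]    # GET len "1"        (the length byte)
    return get_bytes(stream, length)    # GET xyz len "BYTE" (the data)

payload = bytes(range(79))
stream = io.BytesIO(bytes([0x4F]) + payload + b"rest")   # hex 4F = 79 decimal
record = get_length_prefixed(stream)
```

After the call, the stream position has advanced past the length byte and the 79 data bytes, so the next read starts at b"rest".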
Here are some additional examples of the GET command:
SAMPLE COMMAND      EXPLANATION
-----------------   -----------
GET x "5" "TEXT"    Reads 5 bytes into the x variable
GET x "5"           Same as above (since "TEXT" is the default)
GET x len           In this case, len must contain a text number (e.g. "7")
GET x len "BYTE"    In this case, len must be a byte (i.e. binary format)
When the number is in "TEXT" format, spaces and tabs are ignored. Thus, the
following command is valid:
GET abc " 5 " "TEXT"
You can also specify the length of the data as a predefined data type (see
"Predefined Data Types" and "The MakeData Command"). Some examples...
SAMPLE COMMAND      EXPLANATION
-----------------   -----------
GET x "INTEGER"     Reads in an integer value (2 bytes long)
GET x "SHORTINT"    Reads in a short integer value (1 byte long)
GET x "BYTE"        Reads in a byte value (1 byte long)
GET x "LONGINT"     Reads in a long integer (4 bytes long)
GET x "REAL"        Reads in a real value (6 bytes long)
GET x "REAL 2"      Same as above (the decimal precision value 2 is ignored)
TECHNICAL NOTE: In some applications, you will find that a variable-length
record may be followed by a "noise" byte. This can occur if the program
that created the input file "aligns data to word boundaries" and the record
you are reading has an odd number of bytes. In such case, your POM file
must determine (using CALC commands) if the length byte is odd or even, and
react accordingly.
Delimiter-Terminated Data
-------------------------
FORMAT: GET var1 value1 "END" [value3]
PURPOSE: Reads delimiter-terminated data from the input file.
PARAMETERS: var1 is the variable being set
value1 is the terminating character you are searching for
"END" means you are searching for a terminating character
value3 is "I" (for Include) or "X" (for eXclude)
DEFAULTS: value3 = "X"
ALTERNATIVES: The PARSE and PEEL commands
The FINDPOSN command used with the COPY command
One common way to represent variable-length text data in a file is to
terminate the text string with the null (ASCII 0) character. You can
read in this kind of data with the GET command, as follows:
GET abc #0 "END" <-- #0 means ASCII zero (See "Values")
This reads the input file until the null (ASCII 0) character is found, or
until 255 characters have been read in (whichever comes first).
The terminating character is not included in the string unless you
explicitly request it. There are two forms of GET command that control
this behaviour:
GET abc #0 "END" "X" <-- Exclude the terminating character (default)
GET abc #0 "END" "I" <-- Include the terminating character
Here is a sample POM file that reads a data file that consists entirely of
zero-terminated strings:
CHOP 0 <-- This means you will handle all file reading
GET abc #0 "END" <-- Read in the data
OUTEND |{abc} <-- Send the data to the output file
Handling Long Strings
---------------------
If some of the data is more than 255 characters long, you can handle it as
follows:
CHOP 0 <-- Handle all file reading manually
GET data #0 "END" "I" <-- Include the terminating character
SETLEN len data <-- Get the length of the string
COPY lastchar data len <-- Get the last character
BEGIN lastchar = #0 <-- Test the last character
DELETE data len <-- Remove the last character (the terminator)
OUTEND |{data} <-- Output the string, and start a new line
ELSE
OUT |{data} <-- Output the string, but stay on the same line
END
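The same chunked reading can be sketched in Python. This is an illustration only: get_until and read_long_string are hypothetical names, and stopping quietly at end of file is an assumption (this section does not say how GET reacts when the terminator is never found).

```python
import io

MAX_GET = 255    # the most a single GET can read

def get_until(stream, terminator, include=False):
    """Read until the terminator or MAX_GET bytes, whichever comes first."""
    buf = bytearray()
    while len(buf) < MAX_GET:
        byte = stream.read(1)
        if not byte:             # assumption: stop quietly at end of file
            break
        if byte == terminator:
            if include:          # the "I" option keeps the terminator
                buf += byte
            break
        buf += byte
    return bytes(buf)

def read_long_string(stream, terminator=b"\x00"):
    """Join successive chunks until one ends with the terminator."""
    parts = []
    while True:
        chunk = get_until(stream, terminator, include=True)
        if not chunk:
            break
        if chunk.endswith(terminator):
            parts.append(chunk[:-1])   # like DELETE data len
            break
        parts.append(chunk)            # like OUT: stay on the same line
    return b"".join(parts)
```

A chunk that does not end with the terminator means the 255-byte limit was reached first, so the string continues in the next chunk.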
All of the examples given above assume that the terminating character is
ASCII 0 (i.e. #0), because this is by far the most common terminator.
However, you can use other values, if required:
GET data "X" "END"
In actual usage, it is not likely that you will find data strings that are
terminated by an "X" character, but the capability is there if the need
arises.
-------------------
The GETTEXT Command
-------------------
** ADVANCED COMMAND FOR EXPERIENCED USERS **
FORMAT: GETTEXT var1 value1 [value2]
PURPOSE: Manually reads bytes from the input file, then converts
them into text format.
PARAMETERS: var1 is the variable being set
value1 is the predefined data type in the input file
value2 is the MAKETEXT "convert from" parameter
DEFAULTS: If value2 is omitted, it is assumed to be the same as value1
NOTES: Before studying this command, you should already be familiar
with the GET and MAKETEXT commands.
SEE ALSO: "Predefined Data Types"
When reading a binary file, you frequently need to read numeric values then
convert them to text. For example:
GET x "WORD" <-- Read a two-byte number from the file
MAKETEXT y x "WORD" <-- Convert it into text form
You can do both of these operations at once, using the GETTEXT command:
GETTEXT y "WORD"
This reads a "WORD" (two binary bytes) from the input file, and then
converts it into text (e.g. "1234").
You only need to use value2 if you are converting a number to a text-based
data type such as "DATE". For example:
ZERODATE "1936" "1" "1" <-- Set "day zero"
GETTEXT date "LONGINT" "DATE Y/M/?d" <-- Get and convert a date
The GETTEXT command is also helpful if you are reading text data from a
fixed-length field, but it is padded with spaces or nulls:
GETTEXT x "80" "TRIMMED"
This reads in 80 characters, then removes tabs, spaces and nulls from
either end of the string.
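As a rough Python analogy of the two uses shown above, the sketch below reads a two-byte number and a padded text field. The function names are hypothetical, and the byte order (little-endian, unsigned) assumed for "WORD" is an assumption not stated in this section.

```python
import io
import struct

def gettext_word(stream):
    """Read two binary bytes and return the number as text."""
    raw = stream.read(2)
    if len(raw) != 2:
        raise EOFError("WORD needs two bytes")
    # Assumption: "WORD" is an unsigned 16-bit value stored low byte first.
    (value,) = struct.unpack("<H", raw)
    return str(value)

def gettext_trimmed(stream, width):
    """Read a fixed-width field and strip tabs, spaces and nulls from both ends."""
    raw = stream.read(width)
    return raw.decode("latin-1").strip(" \t\x00")

stream = io.BytesIO(struct.pack("<H", 1234) + b"  Hello\x00\x00\x00")
number = gettext_word(stream)          # like GETTEXT y "WORD"
text = gettext_trimmed(stream, 10)     # like GETTEXT x "10" "TRIMMED"
```

The point of the sketch is only that reading and converting happen in one step, as they do with GETTEXT.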
--------------------
The READNEXT Command
--------------------
FORMAT: READNEXT [value [comparator] value]
PURPOSE: The READNEXT command gets the next line of the input file
(in other words, it replaces the current $FLINE), while
maintaining your place in the POM file.
NOTES: For an explanation of comparators, see "Using Comparators".
SEE ALSO: "The MinLen Command" and "Line Counters"
READNEXT is helpful if you know for certain what type of information the
next line will contain. Here is an example:
SET note = ""
SET customer = $FLINE[1 20]
BEGIN $FLINE ^ "See note below"
READNEXT
SET note = $FLINE[1 20]
END
OUTEND |{customer} {note}
If the input line contains the words "See note below", Parse-O-Matic will
read the next line of the input file (replacing the current $FLINE), thus
obtaining the comment about the customer.
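The control flow of this example can be mirrored in Python with an iterator, where next() plays the role of READNEXT. This is a hedged sketch: the parse function is hypothetical, and Python slices are used in place of POM substrings (they do not pad short lines).

```python
def parse(lines):
    out = []
    it = iter(lines)
    for fline in it:                     # one pass of the POM file per line
        note = ""
        customer = fline[0:20]           # SET customer = $FLINE[1 20]
        if "See note below" in fline:    # BEGIN $FLINE ^ "See note below"
            fline = next(it, "")         # READNEXT ("" at end of file)
            note = fline[0:20]           # SET note = $FLINE[1 20]
        out.append(customer + " " + note)
    return out

lines = ["Acme Ltd".ljust(20) + " See note below",
         "Pays late",
         "Bolt Inc"]
result = parse(lines)
```

Because the iterator is shared, the line consumed by next() is not seen again by the outer loop, just as the line read by READNEXT is not processed a second time.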
End of File Conditions
----------------------
If you do a READNEXT at the end of the input file, READNEXT will set $FLINE
to null (""). The POM file will continue processing.
Optional Comparisons
--------------------
READNEXT can make a comparison. This is useful for skipping extraneous
lines of input. For example:
READNEXT $FLINE[1 5] = "NOTE:"
This obtains the next input line if the current input line starts with
"NOTE:".
Ignoring Null Lines
-------------------
By default, READNEXT will read null lines from the input file. If you want
it to ignore null lines, you can use an optional parameter of the MINLEN
command to specify a minimum length for the READNEXT command. For details,
see "The MinLen Command".
If you are reading a DBF (DBase) file, you cannot "ignore null lines",
because the data is not in line format. In such case, you must check a
particular field to see if it is null. (See "DBF Files")
If you are using the CHOP or SPLIT commands, it may not be particularly
useful to "ignore null lines", since by definition you are requesting a
particular number of bytes each time the input is read. Nevertheless,
if you do a READNEXT at the end of the input file, READNEXT will set $FLINE
to null (""), and continue processing the POM file.
Saving the Previous Line
------------------------
When you do a READNEXT, there is no way to return to the previous line of
the input file. If you need it for other work, you should save a copy:
SET note = ""
SET customer = $FLINE[1 20]
SET saveline = $FLINE
BEGIN $FLINE ^ "See note below"
READNEXT
SET note = $FLINE[1 20]
END
SET custnum = saveline[22 25]
OUTEND |{custnum} {customer} {note}
The example above is not very efficient; it would make more sense to
extract custnum BEFORE you use READNEXT. However, in some instances you may
find it more convenient to save $FLINE before doing a READNEXT.
===========================================================================
INPUT FILTERS
===========================================================================
------------------
The MINLEN Command
------------------
FORMAT: MINLEN value1 [value2]
PURPOSE: MINLEN specifies the minimum length an input line must be to
be considered for parsing.
PARAMETERS: value1 is the minimum input line length
value2 is the minimum length for a READNEXT command
NUMERICS: Tabs, spaces and commas are stripped from value1 and value2
DEFAULTS: value2 = "0"
SEE ALSO: "The ReadNext Command"
If you omit the MINLEN command, the minimum length is assumed to be 1.
That is to say, all lines 1 character or longer will be processed and
shorter lines (null lines in other words) will be ignored.
MINLEN is useful for ignoring brief information lines that clutter up a
report that you are parsing. For example, in the sample file EXAMPL02.POM,
the MINLEN command is set to 85 to ensure that all lines shorter than 85
characters will be ignored. This simplifies the coding considerably.
The longest allowable input line is 255 characters, unless you use the
SPLIT or CHOP command (see "The Split Command" and "The Chop Command").
The optional setting value2 specifies the minimum length for a READNEXT
command. If omitted, this value is assumed to be "0", meaning that
READNEXT will, by default, read null lines. If you set value2 to "1",
READNEXT will keep reading until it finds an input line of 1 or more
characters, or hits the end of file. The value2 setting has no effect
if you are reading a DBF (DBase) file.
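The two MINLEN settings act as two separate length filters. A minimal Python sketch follows; the function names main_lines and readnext are hypothetical.

```python
def main_lines(lines, minlen=1):
    """Lines that reach the top of the POM file (value1 of MINLEN)."""
    return [line for line in lines if len(line) >= minlen]

def readnext(remaining, readnext_minlen=0):
    """Next line of at least readnext_minlen characters (value2 of MINLEN)."""
    for line in remaining:
        if len(line) >= readnext_minlen:
            return line
    return ""    # end of file: $FLINE becomes null
```

With the defaults, null lines never start a pass through the POM file, but READNEXT will still return them.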
------------------
The IGNORE Command
------------------
FORMAT: IGNORE value1 [comparator] value2
PURPOSE: When the comparison is true, the input line is ignored and
all further processing on the input line stops.
NOTES: For an explanation of comparators, see "Using Comparators".
ALTERNATIVES: The ACCEPT and BEGIN commands.
Here is a typical application of the IGNORE command:
IGNORE $FLINE[3 9] ^ "Date"
This skips any input line that contains the word "Date" between columns 3
and 9 ($FLINE is the line just read from the input file).
------------------
The ACCEPT Command
------------------
FORMAT: ACCEPT value1 [comparator] value2
PURPOSE: The ACCEPT command accepts the input line if the comparison
is true. ACCEPT commands can be "clustered" to allow a series
of related tests.
NOTES: For an explanation of comparators, see "Using Comparators".
In the following explanation, we will demonstrate the
command using only the "literally identical" ("=")
comparator.
ALTERNATIVES: The IGNORE command.
If the entire POM file reads as follows:
ACCEPT $FLINE[15 17] = "YES"
OUTEND "X" = "X" |{$FLINE}
then any input line that contains "YES" starting in column 15 is sent to
the output file. All other lines are ignored.
Clustered Accepts
-----------------
Sometimes you have to check more than one value to see if the input line is
valid. You do this by using "clustered ACCEPTs", which are several ACCEPT
commands in a row.
Briefly stated, if you have several ACCEPTs in a row ("clustered"), they
are all processed to determine if the input line is acceptable or not. If
even one ACCEPT matches up, the line is accepted. To express this in more
detail...
When the comparison is true, the line is accepted, and processing of the
POM file continues for that input line, even if the immediately following
ACCEPTs do NOT produce a match. After all, we've already got a match!
If the comparison is NOT true, Parse-O-Matic looks at the next command
in the POM file. If it is not another ACCEPT, the input line is ignored.
If it is another ACCEPT, maybe it will produce a match -- so Parse-O-Matic
moves to that command.
The following POM file uses clustered ACCEPTs to accept any line that
contains the name "FRED" or "MARY" between columns 5 and 8, or contains the
word "MEMBER" between columns 20 and 25.
SET NAME = $FLINE[5 8] <-- Set the variable
ACCEPT NAME = "FRED" <-- Look for FRED
ACCEPT NAME = "MARY" <-- Look for MARY
ACCEPT $FLINE[20 25] = "MEMBER" <-- Look for MEMBER
OUTEND "X" = "X" |{$FLINE} <-- Output the line if we get this far
The following example will NOT work, however:
ACCEPT $FLINE[20 25] = "MEMBER"
SET NAME = $FLINE[5 8]
ACCEPT NAME = "FRED"
ACCEPT NAME = "MARY"
OUTEND "X" = "X" |{$FLINE}
It will not work because the ACCEPTs are not clustered; if the first ACCEPT
fails, the input line is rejected as soon as the SET command is
encountered. The next two ACCEPTs are not reached in such case.
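The clustering rule can be checked with a small simulation. The Python sketch below is a model of the rule as described in this section, not Parse-O-Matic's own code; line_survives and the command tuples are hypothetical.

```python
def line_survives(commands, fline):
    """commands is a list of ("ACCEPT", test) or ("OTHER", None) pairs."""
    i = 0
    while i < len(commands):
        kind, test = commands[i]
        if kind == "ACCEPT":
            if test(fline):
                # A match: skip the remaining ACCEPTs of this cluster.
                while i + 1 < len(commands) and commands[i + 1][0] == "ACCEPT":
                    i += 1
            else:
                following = commands[i + 1][0] if i + 1 < len(commands) else None
                if following != "ACCEPT":
                    return False       # the cluster is exhausted: line ignored
        i += 1
    return True

is_fred = lambda f: f[4:8] == "FRED"          # columns 5 to 8
is_mary = lambda f: f[4:8] == "MARY"
is_member = lambda f: f[19:25] == "MEMBER"    # columns 20 to 25

clustered = [("OTHER", None), ("ACCEPT", is_fred), ("ACCEPT", is_mary),
             ("ACCEPT", is_member), ("OTHER", None)]
unclustered = [("ACCEPT", is_member), ("OTHER", None), ("ACCEPT", is_fred),
               ("ACCEPT", is_mary), ("OTHER", None)]

fred_guest = "0001FRED" + " " * 11 + "GUEST "
john_member = "0002JOHN" + " " * 11 + "MEMBER"
john_guest = "0003JOHN" + " " * 11 + "GUEST "
```

In this model the clustered file keeps the FRED line and the MEMBER line, while the unclustered file rejects the FRED line at its first ACCEPT.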
===========================================================================
FLOW CONTROL COMMANDS
===========================================================================
-----------------
The BEGIN Command
-----------------
FORMAT: The basic format for the BEGIN command is as follows:
BEGIN value1 [comparator] value2
:
Dependent code
:
END
PURPOSE: If the comparison is true (e.g. value1 equals value2), then
the dependent code (the POM lines between the BEGIN and the
END) is executed. If the comparison is false, then the
dependent code is skipped.
NOTES: For an explanation of comparators, see "Using Comparators".
In the following explanation, we will demonstrate the
command using only the "literally identical" ("=")
comparator.
SEE ALSO: "The Else Command" and "The Again Command"
It is traditional in programming to indent code that appears in blocks
such as Parse-O-Matic's BEGIN/END technique. This makes the logic of
the POM file easier for us to understand. For example:
BEGIN datatype = "Employee"
  SET phone   = $FLINE[ 1 10]
  SET address = $FLINE[12 31]
END
BEGIN/END blocks can be nested. That is to say, you can have BEGIN/END
blocks inside other BEGIN/END blocks. Here is an example, with arrows
to indicate the levels of each BEGIN/END block...
BEGIN datatype = "Employee"       <---------------------
  SET phone   = $FLINE[ 1 10]                          |
  SET address = $FLINE[12 31]                          |
  SET areacode = phone[1 3]                            | First
  BEGIN areacode = "514"          <------- Second      | Level
    SET local = "Y"                      | Level       | Block
    SET tax   = "Y"                      |             |
  END                             <------- Block       |
END                               <---------------------
In this case, the "inner" block (starting with BEGIN areacode = "514") is
reached only if the "outer" block (BEGIN datatype = "Employee") is true.
If the outer block is false, the inner block is ignored.
A nested BEGIN/END block must always be completely inside the outer block.
Study the following (incorrect) example:
BEGIN datatype = "Employee"              <----
  SET phone = $FLINE[ 1 10]                  | First
  SET areacode = phone[1 3]                  | Level
    BEGIN areacode = "514"        <---       | Block?
      SET local = "Y"                |       |
END                                  |   <----
      SET tax = "Y"                  |
    END                           <--- Second Level Block?
Parse-O-Matic does not pay attention to the indenting -- it is only a
tradition we use to make the file easier to read. The code will be
understood this way:
BEGIN datatype = "Employee"       <---------------------
  SET phone = $FLINE[ 1 10]                            | First
  SET areacode = phone[1 3]                            | Level
  BEGIN areacode = "514"          <--- Second          | Block
    SET local = "Y"                  | Level           |
  END                             <--- Block           |
  SET tax = "Y"                                        |
END                               <---------------------
You can nest BEGIN/END blocks up to 25 deep -- although it is unlikely you
will ever need that much nesting. Here is an example of code that uses
nesting up to three deep:
BEGIN datatype = "Dog"            <----------------------------------
  SET breed = $FLINE[1 10]                                          | First
  BEGIN breed = "Collie"          <-----------------------          | Level
    SET noise = "Woof"                                   | Second   | Block
    BEGIN name = "Spot"           <------ Third          | Level    |
      SET attitude = "Friendly"         | Level          | Block    |
    END                           <------ Block          |          |
  END                             <-----------------------          |
  BEGIN breed = "Other"           <----------------------- Another  |
    SET noise = "Arf"                                    | Second   |
    SET attitude = "Unknown"                             | Level    |
  END                             <----------------------- Block    |
END                               <----------------------------------
Once again, the indentation is for clarity only and does not affect the
way the POM file runs. However, you will find that it makes your POM
file much easier to understand.
----------------
The ELSE Command
----------------
FORMAT: The format of a BEGIN/ELSE/END block is as follows:
BEGIN value1 [comparator] value2
:
Code that is run if the comparison is true
:
ELSE
:
Code that is run if the comparison is false
:
END
PURPOSE: The ELSE command tells Parse-O-Matic to execute the
following block of code (up until the END command) if the
corresponding BEGIN comparison is NOT true.
NOTES: The ELSE command is not the same as the ELSE used to pad the
IF statement (e.g. IF xyz = "3" THEN x = "Y" ELSE "N"). In
the IF command, ELSE makes the statement more clear, but it
can be omitted (e.g. IF $FLINE[1] "3" x "Y" "N").
Here is an example of a BEGIN/ELSE/END block:
BEGIN $FLINE[1 10] = "JOHN SMITH"
SET x = "This is John"
ELSE
SET x = "This is not John"
END
If you are using several levels of nesting, it is a good idea to indent
your code to show the relationship of the BEGIN, ELSE and END statements:
BEGIN datatype = "Dog"            <----------------------------------
  SET breed = $FLINE[1 10]                                          | First
  BEGIN breed = "Collie"          <-----------------------          | Level
    SET noise = "Woof"                                   | Second   | Block
    BEGIN name = "Spot"           <------ Third          | Level    |
      SET attitude = "Friendly"         | Level          | Block    |
    END                           <------ Block          |          |
  ELSE                                                   |          |
    SET noise = "Arf"                                    |          |
    SET attitude = "Unknown"                             |          |
  END                             <-----------------------          |
END                               <----------------------------------
The ELSE is at "Level 2". This is because there are three BEGINs ahead of
it, but only one END (3 - 1 = 2).
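The "BEGINs minus ENDs" rule is easy to automate if you want to check the nesting of a long POM file. The Python sketch below is a hypothetical helper, not part of Parse-O-Matic; it handles only BEGIN blocks closed by END or AGAIN and ignores PROLOGUE and EPILOGUE blocks.

```python
def nesting_levels(pom_lines):
    """Return a (level, text) pair for every line, ignoring indentation."""
    depth = 0
    result = []
    for raw in pom_lines:
        text = raw.strip()
        word = text.split()[0].upper() if text else ""
        if word == "BEGIN":
            depth += 1
            result.append((depth, text))
        elif word in ("END", "AGAIN"):
            result.append((depth, text))   # belongs to the block it closes
            depth -= 1
        else:
            result.append((depth, text))
    return result

DOG_EXAMPLE = [
    'BEGIN datatype = "Dog"',
    'SET breed = $FLINE[1 10]',
    'BEGIN breed = "Collie"',
    'SET noise = "Woof"',
    'BEGIN name = "Spot"',
    'SET attitude = "Friendly"',
    'END',
    'ELSE',
    'SET noise = "Arf"',
    'SET attitude = "Unknown"',
    'END',
    'END',
]
levels = nesting_levels(DOG_EXAMPLE)
else_level = [lvl for lvl, text in levels if text == "ELSE"][0]
```

Applied to the example above, the helper places the ELSE at level 2, in agreement with the calculation in the text.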
---------------
The END Command
---------------
FORMAT: END
PURPOSE: Marks the end of a BEGIN, PROLOGUE or EPILOGUE code block.
The END command marks the end of a "code block". A code block is a series
of lines in a POM file that may be run if the conditions are right.
For a more detailed discussion of the END command, see the following
sections:
- "The Begin Command"
- "The Prologue Command"
- "The Epilogue Command"
-----------------
The AGAIN Command
-----------------
FORMAT #1: BEGIN [value1 [comparator] value2]
:
Code executed if the BEGIN comparison is true or omitted
:
AGAIN [value1 [comparator] value2]
FORMAT #2: BEGIN value1 [comparator] value2
:
Code executed if the BEGIN comparison is true
:
ELSE
:
Code executed if the BEGIN comparison is false
:
AGAIN [value1 [comparator] value2]
PURPOSE: Controls the repetition of a BEGIN block.
NOTES: For an explanation of comparators, see "Using Comparators".
SEE ALSO: "The ReadNext Command", "The Begin Command", "The Else
Command" and "Uninitialized and Persistent Variables"
DEFAULTS: If the comparison part of the AGAIN command is omitted:
- AGAIN repeats if the BEGIN comparison was true or omitted
- AGAIN does not repeat if the BEGIN comparison was false
ADVISORY: If you are familiar with other computer languages, you may
be tempted to use AGAIN to create loops when none are
required. Remember that a POM file is repeated (i.e.
looped) each time a record or line is read from the input
file. The AGAIN command is most appropriate when you have
input records with a variable number of items.
The AGAIN command allows you to implement "loops". A loop is a section of
code that can be repeated one or more times.
AGAIN returns to the corresponding BEGIN if the comparison is true, or if
it is omitted. Since the BEGIN can also have a comparison, and can be used
in conjunction with the ELSE command, this allows many variations:
COMMAND ARRANGEMENT              EFFECT
------------------------------   -----------------------------------------
BEGIN            AGAIN           Loops forever
BEGIN comp       AGAIN           Loops until the BEGIN comparison is false
BEGIN            AGAIN comp      Loops until the AGAIN comparison is false
BEGIN comp       AGAIN comp      Loops until either comparison is false
BEGIN comp ELSE  AGAIN comp      Loops until either comparison is false
BEGIN comp ELSE  AGAIN           Loops until the BEGIN comparison is false
In the last two examples, the ELSE code is run when the BEGIN comparison is
false, then processing continues on the POM line after the AGAIN command.
When a BEGIN comparison is false, the comparison (if any) of the AGAIN
command is not evaluated.
To put it another way: the AGAIN comparison is considered only if the
BEGIN comparison is true or omitted.
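The rules in this section can be condensed into one loop. The following Python sketch is a model of the described behaviour only; run_loop and its parameters are hypothetical, and the limit argument exists solely so that the "loops forever" arrangement can be demonstrated safely.

```python
def run_loop(begin_cmp=None, body=None, else_body=None, again_cmp=None,
             limit=1000):
    """Return the number of times the BEGIN portion was executed."""
    passes = 0
    while passes < limit:
        if begin_cmp is None or begin_cmp():
            if body:
                body()
            passes += 1
            # The AGAIN comparison is considered only on this path.
            if again_cmp is not None and not again_cmp():
                break
        else:
            if else_body:
                else_body()            # runs once, then the loop is left
            break
    return passes

state = {"z": 0, "s": ""}

def body():
    state["z"] += 1
    state["s"] += "X"

def else_body():
    state["s"] += "Y"

passes = run_loop(begin_cmp=lambda: state["z"] < 3,
                  body=body, else_body=else_body)
```

The usage shown corresponds to the "BEGIN comp ELSE AGAIN" arrangement: three passes through the BEGIN portion, then one pass through the ELSE portion.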
Using AGAIN for Variable-Length Data
------------------------------------
Let us say you have a text file that contains the names of people belonging
to various clubs. The file lists the name of the club, then the number of
people in each club, and then the names:
Chess Club
3
John Smith
Mary Jones
Fred Williams
Hopscotch Club
0
Tennis Club
2
Jack Martin
Debbie Harris
You could process this input file with the following POM file:
PAD $FLINE "R" " " "17" <-- Pad the club name out with spaces
OUT |{$FLINE} <-- Send the club name to the output file
READNEXT <-- Get the number of members
SET members = $FLINE <-- Remember this number
BEGIN members = "0" <-- Check if we have any members
OUT |(None) <-- Report if we have no members
ELSE <-- If we have members, do the next part
SET count = "0" <-- Initialize a counter
BEGIN <-- Start the loop
READNEXT <-- Get the person's name
SET count = count+ <-- Count this person
OUT |{$FLINE} <-- Send the name to the file
OUT count #< members |/ <-- Add a separator if not the last name
AGAIN count #< members <-- Go back if we have more members
END <-- Corresponds to the first BEGIN
OUTEND | <-- Start a new line after each club
This POM file would generate the following output:
Chess Club John Smith/Mary Jones/Fred Williams
Hopscotch Club (None)
Tennis Club Jack Martin/Debbie Harris
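For comparison, here is the same job written as a Python sketch. It is not a translation tool, only an illustration of the logic; club_report is a hypothetical name, and padding the club name on the right to 17 characters is an assumption about what the PAD command in the example does.

```python
def club_report(lines):
    out = []
    it = iter(lines)
    for fline in it:                      # one pass per club name
        row = fline.ljust(17)             # PAD $FLINE "R" " " "17"
        members = int(next(it))           # READNEXT, then SET members = $FLINE
        if members == 0:
            row += "(None)"
        else:
            # The BEGIN ... AGAIN loop: one READNEXT per member.
            names = [next(it) for _ in range(members)]
            row += "/".join(names)
        out.append(row)                   # OUTEND |
    return out

CLUBS = ["Chess Club", "3", "John Smith", "Mary Jones", "Fred Williams",
         "Hopscotch Club", "0",
         "Tennis Club", "2", "Jack Martin", "Debbie Harris"]
rows = club_report(CLUBS)
```

The member count read from the file drives the inner loop, which is exactly the situation the ADVISORY above describes: input records with a variable number of items.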
Pointless Command Combinations
------------------------------
Some combinations of BEGIN, ELSE and AGAIN are pointless. The following
command arrangements contain code that is never run:
COMMAND ARRANGEMENT              NOTE
------------------------------   ------------------------------------------
BEGIN      ELSE  AGAIN comp      The ELSE portion is never executed
BEGIN      ELSE  AGAIN           Loops forever; the ELSE portion never runs
Examples
--------
Either of these two POM files will read a text file and ignore any lines
that contain the word "COW":
Using the AGAIN Command      Using the IGNORE Command
-----------------------      ------------------------
BEGIN $FLINE ^ "COW"         IGNORE $FLINE ^ "COW"
  READNEXT                   OUTEND |{$FLINE}
AGAIN
OUTEND |{$FLINE}
The shorter POM file is more efficient, but the results would be
roughly the same for both. Remember that a POM file is processed each
time Parse-O-Matic reads an input record (or line), so the second version
is, in effect, looping as many times as there are records in the file.
The following POM file will read one line from a file, then send the string
[6]123[7]123[6]123[7]123 to the output file:
SET z = "0"
BEGIN <-----------------------------
SET y = "5" | First
BEGIN y <> "7" <------------------ | Level
SET x = "0" | Second | (Outermost)
SET y = y+ | Level | Loop
APPEND s s "[" y "]" | Loop |
BEGIN x <> "3" <-------- | |
SET x = x+ | Third | |
APPEND s s x | Level | |
AGAIN <-------- | |
AGAIN <------------------ |
SET z = z+ |
AGAIN z <> "2" <-----------------------------
OUTEND |{s}
NEXTFILE
The third level (innermost) loop ... generates 123
The second level (middle) loop ..... generates [6]123[7]123
The first level (outermost) loop ... generates [6]123[7]123[6]123[7]123
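If you are more familiar with conventional languages, the three nested loops can be read as the following Python sketch (an illustration only; nested_loops is a hypothetical name, and the variables mirror those in the POM file):

```python
def nested_loops():
    s = ""
    z = 0
    while True:                    # BEGIN (no comparison)
        y = 5
        while y != 7:              # BEGIN y <> "7"
            x = 0
            y += 1
            s += "[" + str(y) + "]"
            while x != 3:          # BEGIN x <> "3"
                x += 1
                s += str(x)
                                   # AGAIN (third level)
                                   # AGAIN (second level)
        z += 1
        if z == 2:                 # AGAIN z <> "2" stops when z equals 2
            break
    return s

result = nested_loops()
```

Each AGAIN without a comparison simply returns to its BEGIN, whose comparison then decides whether the loop continues.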
The following POM file will read one line from a file, then send the
string "XXXY" to the output file:
SET z = "0" <-- Initialize a counter
SET s = "" <-- Initialize the string we will output
BEGIN z < "3" <-- Check if the counter has reached "3"
SET z = z+ <-- Add one to the counter
APPEND s s "X" <-- Add an "X" to the end of the output string
ELSE /__ The ELSE section is run when
APPEND s s "Y" \ the BEGIN comparison is false
AGAIN <-- Go back to the BEGIN
OUTEND |{s} <-- Continue here after the ELSE portion
NEXTFILE <-- Stop reading the input file
----------------
The DONE Command
----------------
FORMAT: DONE [value1 [comparator] value2]
PURPOSE: The DONE command will discontinue processing the POM file
and proceed to the next input line, whereupon the POM file
will restart at the top.
NOTES: For an explanation of comparators, see "Using Comparators".
In the following explanation, we will demonstrate the
command using only the "literally identical" ("=")
comparator.
ALTERNATIVES: The NEXTFILE, IGNORE and ACCEPT commands.
The DONE command is most useful when you have a long series of BEGIN/END
blocks which make a related comparison. For example:
SET salesrep = $FLINE[11 50]
SET region = $FLINE[ 1 2]
BEGIN region = "US"
OUTEND |Sales representative for U.S.A.: {salesrep}
DONE
END
BEGIN region = "CN"
OUTEND |Sales representative for Canada: {salesrep}
DONE
END
BEGIN region = "EU"
OUTEND |Sales representative for Europe: {salesrep}
DONE
END
:
etc.
As you can see, if one of the BEGIN comparisons is true, all of the
following ones will inevitably be false. Rather than processing all the
others, you can use the DONE command to bail out and get ready for the
next input line.
The DONE command provides two benefits:
- It can speed up processing slightly
- It makes full traces easier to understand
For an explanation of traces, see the section entitled "Tracing".
Unless you use a comparison (explained later), the DONE command is useful
only inside BEGIN/ELSE/END blocks. If you write a POM file like this:
SET custnum = $FLINE[ 1 10]
SET custname = $FLINE[11 50]
DONE
OUTEND |{custname} {custnum}
then the OUTEND statement will NEVER be reached.
Here is how you specify a comparison for the DONE command:
DONE $FLINE = "End of Data"
This discontinues the POM file, and proceeds to the next input line, if the
current input line ($FLINE) is "End of Data".
--------------------
The NEXTFILE Command
--------------------
FORMAT: NEXTFILE [value1 [comparator] value2]
PURPOSE: NEXTFILE discontinues processing the current input file and
proceeds to the next one, restarting the POM file from the
top.
NOTES: For an explanation of comparators, see "Using Comparators".
In the following explanation, we will demonstrate the
command using only the "literally identical" ("=")
comparator.
ALTERNATIVES: The HALT command.
The NEXTFILE command is useful when you process multiple input files (see
"POM and Wildcards"). Here is an example, which we will call TEST.POM:
BEGIN $FLINE = "End of Data"
OUTEND |{numlines} lines of data printed
SET numlines = ""
NEXTFILE
END
SET numlines = numlines+
OUTEND |{$FLINE}
Let's say you have three text files: DATA1.XYZ, DATA2.XYZ and DATA3.XYZ.
The last line of each file says "End of Data". You could copy all three
files to the file OUTPUT.TXT with this command:
POM TEST.POM DATA?.XYZ OUTPUT.TXT
This would copy the data from each file, but when it gets to the line
reading "End of Data", it records the number of lines of data that were
printed. Any lines after the "End of Data" line are skipped, because of
the NEXTFILE command.
The NEXTFILE command can specify a comparison. Here is an example:
NEXTFILE $FLINE = "End of Data"
OUTEND |{$FLINE}
Assuming the same input files (DATA1.XYZ etc.), and using the same POM
command as last time, this POM file would simply copy up to (but not
including) the line that reads "End of Data" in each input file.
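As a rough model of what happens, here is a Python sketch of the
two-line POM file above. It is an illustration only: the function name
copy_until_marker is hypothetical, and each input file is represented as
a simple list of lines rather than a real file on disk.

```python
def copy_until_marker(files, marker="End of Data"):
    """Copy lines from each file until the marker line is reached."""
    output = []
    for lines in files:              # one list of lines per input file
        for fline in lines:
            if fline == marker:      # NEXTFILE $FLINE = "End of Data"
                break                # skip the rest of this input file
            output.append(fline)     # OUTEND |{$FLINE}
    return output


data1 = ["alpha", "beta", "End of Data", "ignored"]
data2 = ["gamma", "End of Data"]
print(copy_until_marker([data1, data2]))
```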
----------------
The HALT Command
----------------
FORMAT: HALT value1 comparison value2 value3 [value4]
PURPOSE: The HALT command will terminate Parse-O-Matic processing if
the comparison is true.
PARAMETERS: value1 is any value
value2 is any value
value3 is the message to be displayed
value4 is the optional error level (between 100 and 199)
NUMERICS: Tabs, spaces and commas are stripped from value4
ALTERNATIVES: The NEXTFILE command.
SEE ALSO: "The MsgWait Command"
Here is an example of the HALT command:
HALT sales = "0" "Zero sales!"
If the variable named sales is "0", Parse-O-Matic will display an error
box reading "Zero sales!" and terminate after you've pressed a key. A copy
of the message is also placed in the processing log POMLOG.TXT (see
"Logging").
When a HALT condition occurs, Parse-O-Matic terminates with a DOS error
level of 100. You can specify a different value, using value4. This is
useful if you are calling Parse-O-Matic from a batch file or application
program and want to handle different errors in different ways.
You can set value4 to any number between 100 and 199. Consider these
examples:
HALT sales = "0" "Zero sales" "150"
HALT sales[1] = "-" "Negative sales" "160"
This terminates Parse-O-Matic with an error level of 150 if sales are zero.
If the first character of sales is a minus sign, Parse-O-Matic terminates
with an error level of 160.
When coding batch files, remember that the IF ERRORLEVEL command is
considered "True" if the error is the specified value or higher. This
means you should always test the higher value first. See your DOS manual
for details.
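The "test the higher value first" rule can be illustrated with a small
Python model of the batch test. This is a sketch, not batch-file syntax:
the function names are hypothetical, and the levels 150 and 160 are the
ones used in the HALT examples above.

```python
def if_errorlevel(actual, n):
    """Model of the batch test IF ERRORLEVEL n: true when actual >= n."""
    return actual >= n


def classify(actual):
    """Map an error level to a description, testing the highest value first."""
    tests = ((160, "Negative sales"), (150, "Zero sales"), (100, "Other HALT"))
    for level, meaning in tests:
        if if_errorlevel(actual, level):
            return meaning
    return "Below 100"


for level in (160, 150, 100, 0):
    print(level, classify(level))
```

If the tests were listed lowest first, the test for 100 would also be
true for 150 and 160, and the more specific branches would never run.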
--------------------
The PROLOGUE Command
--------------------
FORMAT: The format for PROLOGUE (used in conjunction with the END
command) is as follows:
PROLOGUE
:
Dependent code
:
END
PURPOSE: PROLOGUE defines dependent code which is run before the
first line of the input file is read.
SEE ALSO: "The Epilogue Command"
PROLOGUE can be used to set up some variables, or set up a heading --
anything you only want to do once per input file, at the very start.
Here is an example of the PROLOGUE command:
PROLOGUE
SET both = "B"
SET space = " "
END
SET firstname = $FLINE[ 1 10]
SET lastname = $FLINE[15 25]
TRIM firstname both space
TRIM lastname both space
OUTEND |{firstname} {lastname}
When the input file is first opened, the PROLOGUE section sets the
variables "both" and "space". Once they're set, you don't have to change
them (since you're just using them to make the code easier to read). Thus,
it makes sense to set them only at the beginning of processing and not
bother setting them each time the POM file is executed (i.e. each time an
input line is read).
If you are working with multiple files (see "POM and Wildcards"), the
PROLOGUE is run for each input file. If you want to run some code for
the first file only, you can set a "flag", as in this example:
BEGIN firstfile = ""
SET firstfile = "N"
OUTEND |First file only
ELSE
OUTEND |Subsequent files
END
NEXTFILE
If you run this POM file on several files at once, using wildcards, the
first line of the output file will contain the words "First file only",
since the variable "firstfile" has not yet been assigned a value. On
subsequent files, the variable will have the value "N", so the following
lines of the output file will read "Subsequent files".
--------------------
The EPILOGUE Command
--------------------
FORMAT: The format for EPILOGUE (used in conjunction with the END
command) is as follows:
EPILOGUE
:
Dependent code
:
END
PURPOSE: EPILOGUE defines dependent code which is run after the last
line of the input file is read and the POM file is executed
to process it. In other words, once all the input data is
finished, the POM file runs one last time -- but only the
code in the EPILOGUE section.
SEE ALSO: "The Prologue Command"
You can use EPILOGUE to output final results. Let's say your input file
looks like this:
DESCRIPTION UNITS SOLD UNIT PRICE
Wildebeest food 325 $ 9.99
Horse cologne 13 $ 3.25
Moose alarm 210 $ 5.95
: : : : : (Column positions)
1 18 27 33 41
You can find out the total number of units sold (of all types) with the
following POM file:
IGNORE $FLINE[1 7] = "DESCRIP"
CALC units = units "+" $FLINE[18 27]
EPILOGUE
OUTEND |Total units sold = {units}
END
This POM file adds up the number of units sold. The only output is the
single line generated by the OUTEND in the EPILOGUE.
If you are processing multiple files (see "POM and Wildcards"), the
EPILOGUE is run after each input file is finished.
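The flow of the units-sold example can be sketched in Python as follows.
This is an illustration under stated assumptions: the names run and row
are hypothetical, the sample lines are rebuilt so that the units occupy
columns 18 to 27, and the addition is done with Python integers rather
than the CALC command.

```python
def row(desc, units, price):
    """Build a report line with the units right-justified in columns 18-27."""
    return f"{desc:<17}{units:>10}   {price}"


def run(lines):
    output = []
    units = 0
    for fline in lines:                     # the main POM code, once per line
        if fline[0:7] == "DESCRIP":         # IGNORE $FLINE[1 7] = "DESCRIP"
            continue
        units += int(fline[17:27])          # CALC units = units "+" $FLINE[18 27]
    # EPILOGUE: runs once, after the last input line has been processed
    output.append(f"Total units sold = {units}")
    return output


report = [
    "DESCRIPTION      UNITS SOLD   UNIT PRICE",
    row("Wildebeest food", 325, "$ 9.99"),
    row("Horse cologne", 13, "$ 3.25"),
    row("Moose alarm", 210, "$ 5.95"),
]
print(run(report))
```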
===========================================================================
VARIABLE MODIFIERS
===========================================================================
----------------
The TRIM Command
----------------
FORMAT: TRIM var1 value1 value2
PURPOSE: TRIM removes the character in value2 from var1.
PARAMETERS: var1 is the variable being set
value1 is "A" = All; "B" = Both ends;
"L" = Left side only; "R" = Right side only
value2 is the character to be removed.
ALTERNATIVES: The CHANGE command.
TRIM is usually used to remove blanks from either side of text, or leading
zeros from numeric data.
For example:
SET PRICE = $FLINE[20 26]
TRIM PRICE "A" ","
TRIM PRICE "L" "$"
This removes all commas from the variable "PRICE", and removes the leading
dollar sign. Thus:
If the input contains the string: "$25,783"
The first TRIM changes it to: "$25783"
The second TRIM changes it to: "25783"
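The four TRIM settings can be modelled in a few lines of Python. This is
a sketch of the behaviour described above, not Parse-O-Matic's own code;
the function name trim is hypothetical and value2 is assumed to be a
single character.

```python
def trim(value, mode, ch):
    """Remove the character ch from value according to the TRIM setting."""
    if mode == "A":                 # All occurrences
        return value.replace(ch, "")
    if mode == "B":                 # Both ends
        return value.strip(ch)
    if mode == "L":                 # Left side only
        return value.lstrip(ch)
    if mode == "R":                 # Right side only
        return value.rstrip(ch)
    raise ValueError("setting must be A, B, L or R")


price = trim("$25,783", "A", ",")   # commas removed
price = trim(price, "L", "$")       # leading dollar sign removed
print(price)
```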
---------------
The PAD Command
---------------
FORMAT: PAD var1 value1 value2 value3
: : : :
MEANING: Variable Control Char Number
PURPOSE: PAD makes var1 a specified length, padded with a
specified character.
PARAMETERS: var1 is the variable being set
value1 is "L", "R", or "C" (Left, Right or Center)
value2 is the character used to pad the string
value3 is the desired string length
NUMERICS: Tabs, spaces and commas are stripped from value3
ALTERNATIVES: The CHANGE command.
Here is an example of the PAD command. If the variable ABC is already set
to "1234" ...
PAD ABC "L" "0" "7" left-pads it 7 characters wide with zeros ("0001234")
PAD ABC "R" " " "5" right-pads it 5 characters wide with spaces ("1234 ")
PAD ABC "C" "*" "8" centers it, 8 wide, with asterisks ("**1234**")
If the length is less than the length of the string, it is unchanged. For
example, if you set variable XYZ to "PINNACLE", then
PAD XYZ "R" " " "3"
leaves the string as-is ("PINNACLE").
Thus, PAD can not be used to shorten a string. If it is your intention to
make XYZ 3 letters long, you can use the SET command:
SET XYZ = XYZ[1 3]
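The PAD rules, including the "never shortens" rule, can be sketched in
Python. The function name pad is hypothetical. One detail is an
assumption: when centering needs an odd number of pad characters, this
sketch puts the extra one on the right, which the text above does not
specify.

```python
def pad(value, mode, ch, length):
    """Pad value to the given length; a longer value is returned unchanged."""
    missing = length - len(value)
    if missing <= 0:
        return value                       # PAD cannot shorten a string
    if mode == "L":
        return ch * missing + value
    if mode == "R":
        return value + ch * missing
    if mode == "C":
        left = missing // 2                # assumption: odd extra goes right
        return ch * left + value + ch * (missing - left)
    raise ValueError("setting must be L, R or C")


print(pad("1234", "L", "0", 7))
print(pad("1234", "C", "*", 8))
print(pad("PINNACLE", "R", " ", 3))
```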
------------------
The CHANGE Command
------------------
FORMAT: CHANGE var1 value1 value2
PURPOSE: The CHANGE command replaces ALL occurrences of value1
with value2.
ALTERNATIVES: The TRIM command. (The CHANGE command is more powerful than
TRIM, but is not as efficient).
Here is an example of the CHANGE command in action:
SET DATE = $FLINE[31 38]
CHANGE DATE "/" "--"
If the SET command assigns DATE the value: "93/10/15"
Then the CHANGE command converts it to: "93--10--15"
-------------------
The CVTCASE Command
-------------------
FORMAT: CVTCASE var1 value1 [value2]
PURPOSE: CVTCASE converts a value to uppercase or lowercase.
PARAMETERS: var1 is the variable being set
value1 is the value being converted
value2 is the optional control setting
DEFAULTS: value2 = "UI"
ALTERNATIVES: The PROPER or REMAP commands; $FLUPC
CVTCASE converts value1 to uppercase or lowercase and places the result in
var1. Here are some examples:
COMMAND DESCRIPTION
--------------------------- --------------------------------
CVTCASE xyz "Test Case" "U" Sets variable xyz to "TEST CASE"
CVTCASE xyz "Test Case" "L" Sets variable xyz to "test case"
CVTCASE xyz "Test Case" Sets variable xyz to "TEST CASE"
In the last example, the optional control parameter (value2) was omitted.
In such a case, CVTCASE will convert the value to uppercase.
Control Settings
----------------
The control setting (value2) can be one or two characters long, or it can
be omitted (in which case it is assumed to be "UI"). Here are the
available settings for value2:
SETTING CONVERT TO CHARACTER SET
------- ---------- ------------------
"L" Lowercase IBM Extended ASCII
"LI" Lowercase IBM Extended ASCII
"L7" Lowercase 7-bit ASCII
"U" Uppercase IBM Extended ASCII
"UI" Uppercase IBM Extended ASCII
"U7" Uppercase 7-bit ASCII
The IBM Extended ASCII character set defines diacritical (accented)
characters such as "U Umlaut" and "C Cedilla"; these are located in the
ASCII table above value 127. It is the standard character set used by
MS-DOS and PC-DOS.
The 7-bit ASCII character set is concerned only with the characters in the
original definition of ASCII (American Standard Code for Information
Interchange), which does not support diacritical characters. As such,
uppercasing and lowercasing affect only alphabetic characters ("A" to "Z",
and "a" to "z"). This character set is used by many mini-computers, and
is the standard character set of the Unix operating system.
The eighth bit is not ignored if you use CVTCASE with the 7-bit ASCII
character set. If you wish to set the eighth bit to zero (perhaps because
it is a parity bit), you should use the REMAP command.
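The difference made by the 7-bit settings can be shown with a short
Python sketch. It models only "U7" and "L7" (the IBM Extended ASCII
settings are not modelled), and the function name cvtcase7 is
hypothetical. Only the letters A to Z and a to z are converted; every
other character, accented letters included, passes through untouched.

```python
import string

_UPPER7 = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)
_LOWER7 = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def cvtcase7(value, control="U7"):
    """Convert case for the 26 unaccented letters only."""
    if control == "U7":
        return value.translate(_UPPER7)
    if control == "L7":
        return value.translate(_LOWER7)
    raise ValueError("this sketch models only U7 and L7")


print(cvtcase7("Test Case"))
print(cvtcase7("gar\u00e7on"))   # the c-cedilla is left as it is
```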
------------------
The PROPER Command
------------------
FORMAT: PROPER var1 [value1 [value2]]
PURPOSE: The PROPER command converts uppercase text (LIKE THIS) to
mixed-case text (Like This).
PARAMETERS: var1 is the variable being set
value1 is the methods setting
value2 is the name of the Properization Exception File
DEFAULTS: value1 = "IW"
ALTERNATIVES: The CHANGE command; $FLUPC (uppercase version of $FLINE).
The PROPER command is useful when you have a list of names of people and
addresses. You can also use PROPER to change text that has been typed in
uppercase into normal text, with capital letters at the beginning of
sentences.
The simplest way to convert a variable is as follows:
PROPER CustName
If CustName contains "JOHN SMITH", it will be changed to "John Smith".
The conversion routine is fairly intelligent. For example, if it is
converting the words "JAGUAR XJS", it can tell that XJS is not a word
(since it does not contain any vowels) and so the end result will
be "Jaguar XJS". Other "strange-looking" items such as serial numbers
can often be recognized by the PROPER command, and left untouched.
Nevertheless, it is impossible to handle all situations, so the PROPER
command supports a "Properization Exceptions File" (known as a PEF file).
A PEF file lists unusual combinations of letters (typically abbreviations,
such as Dr.). The Parse-O-Matic package includes a file named GENERIC.PEF,
which you may find helpful. You can view it with the SEE program provided
with Parse-O-Matic.
A PEF file is prepared with a text editor and contains one "exception" per
line. Null or blank lines, or lines that start with a semicolon, are
ignored. The longest word that can be specified is 255 characters.
Spaces are permitted, but leading and trailing spaces and tabs are ignored.
To use the PEF file in your PROPER command, place the file name after the
variable name and method setting. For example:
PROPER CustName "W" "GENERIC.PEF"
The "W" is the method setting (explained later). "GENERIC.PEF" is the name
of the PEF file. Unless you specify an explicit path, Parse-O-Matic looks
for the PEF file in the current directory first. If it cannot find the
file there, it looks in the directory where POM.EXE is located. (For
details, see the section entitled "How Parse-O-Matic Searches for a
File".) You can, if you wish, specify a complete path to the file, as in
this example:
PROPER Address "W" "C:\MYFILES\MYPEF.XYZ"
If you don't need an exceptions file, you should not use it, since it slows
down processing somewhat. Needless to say, the more items you have in the
PEF file, the more it slows down processing.
The method setting allows you to specify what PROPER does. There are
several kinds of controls, as follows:
METHOD DESCRIPTION
------ -----------
I Intelligent determination of non-words
S Upcase the first character of each sentence
U Upcase the first alphanumeric character of the line
W Upcase the first letter of each word
The default method setting is "IW", so if you omit the method setting, or
specify a null setting (e.g. PROPER CustName "" "XYZ.PEF"), PROPER will
upcase non-words, and the first letter of each word.
NOTE: If you specify a PEF file, you must also specify a method setting,
even if it is null. The line PROPER CustName "GENERIC.PEF" would not be
understood by Parse-O-Matic. The correct format would be:
PROPER CustName "" "GENERIC.PEF"
The examples provided with Parse-O-Matic demonstrate some ways you can use
the PROPER command. To see the examples, enter START at the DOS prompt,
or run START.BAT from Windows or OS/2, then select TUTORIAL.
------------------
The INSERT Command
------------------
FORMAT: INSERT var1 value1 value2
PURPOSE: The INSERT command inserts text on the left or right of
var1, or at a "found text" position.
PARAMETERS: var1 is the variable being set
value1 is "L" or "R" (Left or Right) or a find-string
(e.g. "<HELLO")
value2 is the value to be inserted
ALTERNATIVES: The APPEND and CHANGE commands.
For example, if the variable ABC is set to "Parse-O-Matic", then
INSERT ABC "L" "Register " sets ABC to "Register Parse-O-Matic"
INSERT ABC "R" " is super" sets ABC to "Parse-O-Matic is super"
You can use a find-string to insert text either before or after the first
occurrence of the text you specify. For example, if the variable xyz is
set to "One a day", then
INSERT xyz "<e" "c" sets xyz to "Once a day"
INSERT xyz ">One " "hour " sets xyz to "One hour a day"
The < prefix means "insert value2 before the found text". The > prefix
means "insert value2 after the found text".
If the find-string is not found, nothing is done.
NOTE: Prior to version 3.40 of Parse-O-Matic, the "insert before" opera-
tion was denoted by the @ prefix rather than the < prefix. This
still works, so you do not have to change your POM files.
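The INSERT rules can be summarized in a short Python sketch. It is a
model of the behaviour described above, not the program's own code; the
function name insert is hypothetical.

```python
def insert(value, where, text):
    """Insert text at the left, the right, or next to the first find-string match."""
    if where == "L":
        return text + value
    if where == "R":
        return value + text
    prefix, find = where[:1], where[1:]
    if prefix not in ("<", ">", "@") or not find:
        raise ValueError("expected L, R, or a find-string such as <HELLO")
    pos = value.find(find)            # first occurrence only
    if pos < 0:
        return value                  # find-string not found: nothing is done
    if prefix == ">":
        pos += len(find)              # insert after the found text
    return value[:pos] + text + value[pos:]   # "<" and the older "@" insert before


print(insert("One a day", "<e", "c"))
print(insert("One a day", ">One ", "hour "))
```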
------------------
The APPEND Command
------------------
FORMAT: APPEND var1 value1 value2 [value3 [value4]]
PURPOSE: The APPEND command concatenates (adds together) two or more
values and places the result in var1.
NOTES: No variable can hold more than 255 characters.
ALTERNATIVES: The INSERT command.
Here is an example of the APPEND command:
APPEND xyz "AB" "CD" "EF" "GHIJ"
This command sets the variable xyz to "ABCDEFGHIJ".
The third and fourth values (value3 and value4 in the FORMAT shown above)
are optional. Thus, you can use APPEND with only two values. For example:
SET x1 = "AB"
SET x2 = "CD"
APPEND x3 x1 x2
This sets the variable x3 to "ABCD". You can concatenate a maximum of four
values with a single APPEND command. If you require additional concaten-
ations, you can use more APPEND commands:
APPEND myvar "ABC" "DEF" "GHI" "JKL"
APPEND myvar myvar "MNO" "PQR"
The first line sets the variable myvar to "ABCDEFGHIJKL". The second line
sets myvar to its previous value, plus "MNOPQR", so that its final value is
"ABCDEFGHIJKLMNOPQR".
-------------------
The MAPFILE Command
-------------------
FORMAT: MAPFILE value1 value2 [value3]
PURPOSE: MAPFILE reads a file containing data for the REMAP command
PARAMETERS: value1 is the name of the map file
value2 is the map name, used by the REMAP command
value3 is the control settings (AnyCase/MatchCase/Transpose)
DEFAULTS: value3 = "MATCHCASE"
NOTES: The maximum length of a map name is 12 characters
SEE ALSO: "The Remap Command", "How Parse-O-Matic Searches for a File"
The MAPFILE command reads in a file which contains data for the REMAP
command, and assigns a name to the collection of data so the REMAP command
can refer to it.
What is a Map File?
-------------------
A map file is an ordinary text file; you can create or edit the file with a
standard text editor, or a word-processor in "generic text" mode. Map files
are usually given the .MPF extension.
A map file contains a list of "mappings". Here are some other words with
approximately the same meaning as "mapping":
Translation Correlation Substitution Equivalence Replacement
In other words, the map file contains a list of data items that should be
replaced by other data items.
Sample Map Files
----------------
The following map files are included in the standard Parse-O-Matic package:
FILE NAME DESCRIPTION
------------ ----------------------------------------------------------
BIN2CHAR.MPF Converts binary data into printable characters and periods
BIN2CODE.MPF Converts binary data into hex codes (e.g. 3F 2C A3)
ASC2EBCD.MPF Converts ASCII data to EBCDIC and vice-versa
Map File Format
---------------
The map file contains one mapping per line. Each mapping consists of two
Parse-O-Matic literals, separated by one or more spaces or tabs. The first
value is the "find" column, while the second value is the "replace" column.
Here are some examples:
"123" "LOS ANGELES" <-- Both values are "literal text strings"
$39 "9" <-- A hex code and a literal text string
#48 "Zero" <-- A decimal code and a literal text string
$FF$30 #00#00 <-- Hex and decimal literals
These columns are lined up for clarity; there is no need to start in a
particular column. Any leading or trailing spaces are removed from a
line, and any number of spaces or tabs can appear between the columns.
The following line is NOT valid:
123 LOS ANGELES
The line must use text, hex or decimal literals (e.g. "text", $FF, #255).
Null or blank lines, or lines that start with a semicolon, are ignored.
The longest line that can be specified is 255 characters. The longest
value that can be specified is 80 characters (after translation, if it
is in hex or decimal mode).
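To make the line format concrete, here is a Python sketch of a reader
for one map-file line. It is an approximation built only from the rules
above, with several assumptions: hex codes are taken to be exactly two
digits after the $ sign, decimal codes one or more digits after the #
sign, quoted text is assumed to contain no embedded quotation marks, and
the length limits are not checked. The names decode and parse_mapping
are hypothetical.

```python
import re

_TOKEN = r'"[^"]*"|\$[0-9A-Fa-f]{2}|#[0-9]+'
_LINE = re.compile(rf'((?:{_TOKEN})+)[ \t]+((?:{_TOKEN})+)')


def decode(literal):
    """Turn a run of text, hex and decimal literals into a plain string."""
    out = []
    for tok in re.findall(_TOKEN, literal):
        if tok[0] == '"':
            out.append(tok[1:-1])               # "text" literal
        elif tok[0] == "$":
            out.append(chr(int(tok[1:], 16)))   # hex code, e.g. $39
        else:
            out.append(chr(int(tok[1:], 10)))   # decimal code, e.g. #48
    return "".join(out)


def parse_mapping(line):
    """Return (find, replace), or None for a blank or comment line."""
    line = line.strip()
    if not line or line.startswith(";"):
        return None
    m = _LINE.fullmatch(line)
    if m is None:
        raise ValueError("not a valid mapping: " + line)
    return decode(m.group(1)), decode(m.group(2))


print(parse_mapping('"123"    "LOS ANGELES"'))
print(parse_mapping('#48      "Zero"'))
```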
Search Order
------------
The REMAP command performs substitutions in the order that they appear in
the map file. In most cases, the longer "find" strings should appear
first. For example, let us say you create a map file named FORWARD.MPF,
which looks like this:
"123" "in Los Angeles"
"12" "in Montreal"
"1" "in Town of Mount Royal"
"2" "in Podunk"
Now let's say you run the following POM file, named FORWARD.POM:
MAPFILE "FORWARD.MPF" "FORWARD"
SET comment = "Forward the memo to office 123"
REMAP comment "FORWARD"
OUTEND |{comment}
This will produce the following output:
Forward the memo to office in Los Angeles
This happens because the string "123" is replaced by the string "in Los
Angeles".
If the order of the lines in FORWARD.MPF is reversed, FORWARD.POM will
produce the following output:
Forward the memo to office in Town of Mount Royalin Podunk3
This happens because the "1" is found (and replaced) first, followed by the
"2". Since there is no "3" in FORWARD.MPF, it is left alone.
Parse-O-Matic does NOT enforce the principle of "progressively shorter
'find' strings". If you are processing a lot of data, you can improve
processing speed slightly by placing a short, frequently-used "find" string
near the top of the list. As long as it is not a sub-string of (i.e.
contained within) one of the following strings, it will not cause any
problems.
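The effect of search order can be tried out with a small Python model.
This is one plausible reading of the behaviour described above, not the
actual REMAP implementation: the text is scanned from left to right, the
mappings are tried in map-file order at each position, and replaced text
is never rescanned. The function name remap is hypothetical.

```python
def remap(text, mappings):
    """Single-pass substitution; mappings is a list of (find, replace) pairs."""
    out = []
    i = 0
    while i < len(text):
        for find, replace in mappings:          # tried in map-file order
            if find and text.startswith(find, i):
                out.append(replace)             # replaced text is not rescanned
                i += len(find)
                break
        else:
            out.append(text[i])                 # no mapping starts here
            i += 1
    return "".join(out)


forward = [
    ("123", "in Los Angeles"),
    ("12", "in Montreal"),
    ("1", "in Town of Mount Royal"),
    ("2", "in Podunk"),
]
memo = "Forward the memo to office 123"
print(remap(memo, forward))          # longest "find" string listed first
print(remap(memo, forward[::-1]))    # same mappings in reverse order
```

With the reversed list, the one-character "find" strings match before
"123" is ever tried, so the digits are replaced one at a time.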
Case Matching
-------------
You can set value3 to "AnyCase" or "MatchCase" (the default).
ANYCASE: The find string need not match in case ("John" = "JOHN")
MATCHCASE: The find string must match ("John" does not match "JOHN")
Processing is faster if you use the default setting (MatchCase).
Reverse Mapping
---------------
If you want the mapping process to work "backwards", you can use the
"Transpose" control setting in value3. For example:
MAPFILE "MYFILE.MPF" "MYFILE" "AnyCase Transpose"
This reverses the mapping process: the "find" column is treated like the
"replace with" column, and vice-versa.
The standard Parse-O-Matic package contains a map file (ASC2EBCD.MPF) which
will translate ASCII files into EBCDIC files -- and vice-versa.
NOTE: EBCDIC is a character representation used on certain large mainframe
computers. Both ASCII and EBCDIC characters are eight bits long,
but EBCDIC uses different bit patterns for most characters.
Since both the "find" and "replace with" columns in ASC2EBCD.MPF are only
one character wide, and since there is no duplication within either column,
the translation process is perfectly reversible. For example:
PROLOGUE
CHOP 1 20 <-- Read 20 bytes at a time
MAPFILE "ASC2EBCD.MPF" "EBCDIC" <-+
MAPFILE "ASC2EBCD.MPF" "ASCII" "TRANSPOSE" | Set up maps
MAPFILE "BIN2CODE.MPF" "CODE" <-+
END
SET x = $FLINE <-+
REMAP x "CODE" | Display original text in
OUTEND |[ORIGINAL] [{$FLINE}] | normal & hex-coded form
OUTEND |[ORIGINAL] [{x}] <-+
REMAP $FLINE "EBCDIC" <-+
SET x = $FLINE | Convert to EBCDIC and
REMAP x "CODE" | display in coded form
OUTEND |[EBCDIC ] [{x}] <-+
REMAP $FLINE "ASCII" <-+
SET x = $FLINE | Convert EBCDIC back to
REMAP x "CODE" | ASCII; display hex code
OUTEND |[ASCII ] [{x}] <-+
OUTEND | <-- Output a separator line
You can run this POM file against any file, then view the output file. You
will see how the original text is converted into EBCDIC and then the EBCDIC
is converted back to ASCII. (Most of the data in the output file is
represented in "hex dump" format, since your computer is not designed to
display EBCDIC.)
TRANSPOSE will often let you use a single map file instead of two, but
before using this technique you should carefully consider how mapping will
take place (see "Irreversible Mapping", below).
Irreversible Mapping
--------------------
Consider the following POM file:
MAPFILE "MYMAP.MPF" "XYZ"
MAPFILE "MYMAP.MPF" "ZYX" "TRANSPOSE"
REMAP $FLINE "XYZ"
REMAP $FLINE "ZYX"
OUTEND |{$FLINE}
In many cases, this is equivalent to the following one-line POM file:
OUTEND |{$FLINE}
because the first REMAP changes $FLINE one way, and the second REMAP
changes it back.
This is not true in ALL cases, however. In some circumstances a REMAP is
not reversible. Consider the following map file:
"XYZ" "CAB"
"ABC" "C"
"DEF" "ABC"
Now consider the following sequence of events. (The * and # characters
show what gets replaced in each step.)
Original string . . . . . . . . . ABCDEF
***###
*###
Remap produces this result . . . . CABC
***
***
Transposed remap of result . . . . XYZC
If you follow the steps of the substitutions, you will see where the
confusion arises. As a general rule, simple substitutions (with no
duplications in whole or in part) are reversible, but if you have any
doubts, you can always take the safe route and use a separate map file
for each direction. (See "Search Order", above, for additional insight
into this matter.)
Memory Limitations
------------------
The MAPFILE command reads the map data into RAM memory. You will normally
have sufficient memory for thousands of bytes worth of mappings. However,
if you do not have enough memory to hold the data, Parse-O-Matic will
display an error message, then terminate. (See "Solving Memory Problems")
To help you track memory usage, the MAPFILE command records memory status
(bytes used and bytes left) in the processing log (see "Logging").
An Example of Remapping
-----------------------
The standard Parse-O-Matic package contains two sample map files:
BIN2CODE.MPF maps single bytes to hex codes (e.g. Hex $31 becomes "31 ")
BIN2CHAR.MPF maps single bytes to either printable characters or periods
You can view these files with the SEE program (included with Parse-O-Matic)
or you can load them into a text editor program.
Here is a POM file that uses the sample map files to create a hex dump of a
binary file:
CHOP 1 16 <-- Read the file 16 bytes at a time
SETLEN w $FLINE <-- Get the actual number of bytes read
BEGIN w <> "16"
PAD $FLINE "R" #0 "16" <-- If less than 16 bytes, pad with nulls
END
SET x = $FLINE <-- Make a copy of $FLINE
SET y = $FLINE <-- Make a copy of $FLINE
MAPFILE "BIN2CHAR" "CHAR" <-- See Note
MAPFILE "BIN2CODE.MPF" "CODE"
REMAP x "CHAR" <-- Change the bytes to printable characters
REMAP y "CODE" <-- Change the bytes to hex codes
OUTEND |{x} {y} <-- Output the line
Note: Since the file name (value1) does not have an extension,
Parse-O-Matic will add the .MPF extension. Thus, the actual file
name Parse-O-Matic looks for is "BIN2CHAR.MPF".
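For comparison, here is a Python sketch that produces a similar hex
dump. It only approximates the two sample map files: bytes 32 to 126 are
assumed to be the "printable" ones (the actual contents of BIN2CHAR.MPF
may differ), and every byte becomes a two-digit hex code followed by a
space. The names dump_line and hex_dump are hypothetical.

```python
def dump_line(chunk):
    """Format up to 16 bytes as printable characters followed by hex codes."""
    chunk = chunk.ljust(16, b"\x00")     # pad a short final chunk with nulls
    chars = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    codes = "".join(f"{b:02X} " for b in chunk)
    return f"{chars} {codes}"            # the two views side by side


def hex_dump(data):
    """Split data into 16-byte chunks and format each one."""
    return [dump_line(data[i:i + 16]) for i in range(0, len(data), 16)]


for line in hex_dump(b"Parse-O-Matic reads binary files\x00\x01"):
    print(line)
```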
-----------------
The REMAP Command
-----------------
FORMAT: REMAP var1 value1
PURPOSE: REMAP transforms sub-strings into other strings
PARAMETERS: var1 is the variable being transformed
value1 is the map name (see "The MapFile Command")
ALTERNATIVES: The LOOKUP and CHANGE commands.
SEE ALSO: "The MapFile Command"
The REMAP command performs intensive substitutions on a variable. It is
equivalent to a large number of CHANGE commands, but has the following
advantages:
- It is faster than using a large number of CHANGEs
- It does not expend your available values (see "Values")
- It prevents multiple substitutions
REMAP Versus CHANGE
-------------------
The "multiple substitution" issue is the most important distinction between
CHANGE and REMAP. REMAP protects substituted text from being
resubstituted. Consider the following POM lines:
SET x = "cat dog mouse"
CHANGE x "cat" "dog"
CHANGE x "dog" "cat"
You might expect these lines to change x to "dog cat mouse", but the actual
result is "cat cat mouse". The first CHANGE command sets the x variable to
"dog dog mouse". The next command then changes both dogs into cats!
You can avoid this problem by using intermediate substitutions or some such
work-around, but this ends up complicating the POM file considerably.
Moreover, this approach can be unwieldy if you have to perform a large
number of substitutions.
Using REMAP
-----------
To accomplish the "cat/dog" substitution mentioned earlier, you can create
a map file (named CATDOG.MPF) with a text editor. It will look like this:
"cat" "dog"
"dog" "cat"
Your POM file will then look like this:
MAPFILE "CATDOG.MPF" "PETS"
SET x = "cat dog mouse"
REMAP x "PETS"
This will change the x variable to "dog cat mouse".
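The contrast between the two commands can be reproduced with a short
Python sketch. Both functions are models written for this illustration
(the names change_all and remap are hypothetical): the first applies one
replacement after another, as a series of CHANGE commands would, while
the second makes a single pass and never rescans replaced text.

```python
def change_all(text, pairs):
    """Apply each substitution in turn; later ones see earlier results."""
    for find, replace in pairs:
        text = text.replace(find, replace)
    return text


def remap(text, pairs):
    """Single pass over the text; replaced text is protected from rescanning."""
    out = []
    i = 0
    while i < len(text):
        for find, replace in pairs:
            if find and text.startswith(find, i):
                out.append(replace)
                i += len(find)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


pets = [("cat", "dog"), ("dog", "cat")]
print(change_all("cat dog mouse", pets))   # cat cat mouse
print(remap("cat dog mouse", pets))        # dog cat mouse
```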
For another example of the REMAP command, see "The MapFile Command".
===========================================================================
FREE-FORM COMMANDS
===========================================================================
----------------------------
What are Free-Form Commands?
----------------------------
The free-form commands are used for extracting information from an input
line that does not have its data in precise columns. Consider the
following input file:
Mouse Gazelle Mouse Elephant
Dog Giraffe Elk Mongoose
Monkey Snake Caribou Trout
| | | |
Column 1 Col 11 Col 21 Col 31
Extracting data that is arranged in tidy columns is simple -- all you need
is the SET command. However, you will need a more powerful command if the
data is "free-form", like this:
Mouse,Gazelle,Mouse,Elephant
Dog,Giraffe,Elk,Mongoose
Monkey,Snake,Caribou,Trout
The data is not arranged in tidy columns. For tasks like this, you need
the free-form commands.
-----------------
The PARSE Command
-----------------
FORMAT: PARSE var1 value1 value2 value3 [value4]
: : : : :
MEANING: Variable Source From To Control
PURPOSE: PARSE sets var1 to the text (found in value1) between
text fragments specified by value2 and value3.
PARAMETERS: var1 is the variable being set
value1 is the source text being read
value2 specifies the starting position
value3 specifies the ending position
value4 is the optional control setting
DEFAULTS: value4 = "X"
ALTERNATIVES: The PEEL command, and COPY used with FINDPOSN.
Consider the following free-form data:
Mouse,Gazelle,Mouse,Elephant
Dog,Giraffe,Elk,Mongoose
Monkey,Snake,Caribou,Trout
The PARSE command lets you extract the "Nth" item. For example, to extract
the third item in each line in the free-form example above, you could use
this command:
PARSE xyz $FLINE "2*," "3*,"
This means "set the variable xyz by looking in $FLINE (the line just read
from the input file) and taking everything between the second comma and the
third comma". For the three lines in the sample input file, the variable
xyz is set to Mouse, then Elk, then Caribou.
Decapsulators
-------------
In the "From" specification in the previous example (i.e. the "2*," part
of the command):
2 means "the second occurrence"
* is a delimiter to mark the end of the occurrence number
, is the text you are looking for
Both the "From" and "To" specifications use this format. Commands using
this format are said to use "decapsulators", because you are extracting
text that is encapsulated (i.e. surrounded) by other text.
Decapsulators may be used to find more than a single character. The
surrounding text can be up to 80 characters long. Let's say the input file
looks like this:
Mouse:::Gazelle:::Mouse:::Elephant
Dog:::Giraffe:::Elk:::Mongoose
Monkey:::Snake:::Caribou:::Trout
You can extract the third item in each line with this command:
PARSE xyz $FLINE "2*:::" "3*:::"
___ ______ _ ___ _ ___
| | | | | |
Variable to set | | | | |
The value to parse | | | "To" text being sought
"From" occurrence number | "To" occurrence number
"From" text being sought
This command sets the variable xyz to Mouse, then Elk, then Caribou.
Sample Application
------------------
The PARSE command is particularly useful for extracting information from
comma-delimited files. Here is an example of a comma-delimited file:
"Mouse","Gazelle","Mouse","Elephant"
"Dog","Giraffe","Elk","Mongoose"
"Monkey","Snake","Caribou","Trout"
You can extract all the fields with this series of commands (note the use
of doubled-up quotes to represent a single quotation mark -- see the
section "Delimiters" for details):
PARSE field1 $FLINE "1*""" "2*"""
PARSE field2 $FLINE "3*""" "4*"""
PARSE field3 $FLINE "5*""" "6*"""
PARSE field4 $FLINE "7*""" "8*"""
For the first line of the sample input file, field1 is set to Mouse, field2
is set to Gazelle, and so on.
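The pattern in the four PARSE commands above is that field number k lies
between quotation mark 2k-1 and quotation mark 2k. The following Python
sketch spells out that arithmetic. It is an illustration only; the names
nth_index and quoted_field are hypothetical, and it assumes that no
field contains a quotation mark of its own.

```python
def nth_index(text, sub, n):
    """Return the position of the nth occurrence of sub, or -1 if absent."""
    pos = -1
    for _ in range(n):
        pos = text.find(sub, pos + 1)
        if pos < 0:
            return -1
    return pos


def quoted_field(line, k):
    """Return field k (counting from 1) of a quoted, comma-delimited line."""
    start = nth_index(line, '"', 2 * k - 1)   # like "1*""" for k = 1
    end = nth_index(line, '"', 2 * k)         # like "2*""" for k = 1
    if start < 0 or end < 0:
        return ""
    return line[start + 1:end]


line = '"Mouse","Gazelle","Mouse","Elephant"'
print([quoted_field(line, k) for k in range(1, 5)])
```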
The Occurrence Number
---------------------
The occurrence number must be between 1 and 255. The following lines are
not valid PARSE commands:
PARSE xyz $FLINE "0*," "1*," <-- "From" decapsulator invalid: uses 0
PARSE xyz $FLINE "1*," "256*," <-- "To" decapsulator invalid: uses 256
The occurrence number must always be followed by a "*" so you can search
for a number. Consider the following example (the meaning of which would
be unclear without the "*" delimiter):
PARSE xyz "XXX2YYY2ZZZ2" "1*2" "2*2"
This sets xyz to the text occurring between the first "2" and the second
"2". In other words, xyz is set to "YYY".
Finding the Last Occurrence
---------------------------
A decapsulator can refer to "the LAST occurrence":
PARSE xyz "AaaBAbbBAccB" ">*A" ">*B"
In both decapsulators, the ">" symbol means "the last occurrence". Thus,
the command tells Parse-O-Matic, "Set the xyz variable to everything
between the last A and the last B". This sets the xyz variable to "cc".
You can also use the "<" character to mean "the FIRST occurrence", although
this is somewhat redundant, since the following commands are equivalent:
PARSE xyz "AaaBAbbBAccB" "<*A" "<*B"
PARSE xyz "AaaBAbbBAccB" "1*A" "1*B"
PARSE xyz "AaaBAbbBAccB" "A" "B"
All three commands set the xyz variable to "aa".
Unsuccessful Searches
---------------------
If PARSE does not find the search text, the variable will be set to a null
(""). Here are two examples:
PARSE abc "ABCDEFGHIJ" "1*K" "1*J" <-- There is no "K"
PARSE abc "ABCDEFGHIJ" "1*A" "1*X" <-- There is no "X"
If the "From" text is found after the "To" text, Parse-O-Matic will display
an error message, then terminate. For example:
PARSE abc "ABCDEFGHIJ" "1*J" "1*A" <-- "J" comes after "A"
This kind of failure typically happens if the input data contains an odd
arrangement of text that you had not foreseen.
The Control Setting
-------------------
The PARSE command has an optional "Control" parameter, which tells PARSE
whether to include or exclude the surrounding text that was found. By
default (as shown in all of the preceding examples), the delimiting text
is excluded. However, if you want to include it, you can add "I" at the
end of the PARSE command, as in this example:
PARSE xyz "aXcaYcaZc" "2*a" "2*c" "I"
This tells Parse-O-Matic to give you everything between the second "a" and
the second "c" -- including the "a" and "c". In other words, this sets the
variable xyz to "aYc". You can also set the Control specification to "X"
(meaning "exclude"), although since this is the default setting for PARSE,
it really isn't necessary. Here is an example:
PARSE xyz "a1ca2ca3c" "2*a" "2*c" "X"
This sets the variable xyz to "2".
The Plain Decapsulator
----------------------
The occurrence number is not always needed. Either the "From" or "To"
decapsulator can be represented as a plain string, as follows:
PARSE xyz $FLINE "ABC" "XYZ"
This means:
- Start at the first "ABC" found in the value being parsed
- End with the first "XYZ" found in the value being parsed
The Null Decapsulator
---------------------
Here is a helpful variation of the "From" decapsulator:
"" means "Start from the first character in the value being parsed"
A similar variation can be used with the "To" decapsulator:
"" means "End with the last character in the value being parsed"
If you use the null ("") decapsulator for "From" or "To", the "found" value
(the first character for "From", or the last character for "To") will
always be included (see "Overlapping Decapsulators" for the single
exception to this rule). Here is an example:
PARSE xyz "ABCABCABC" "" "2*C"
This sets the variable xyz to "ABCAB". The "From" value (i.e. the first
character) is NOT excluded. However, when PARSE finds the "To" value (i.e.
the second occurrence of the letter C) it IS excluded. If you want to
include the second "C", you should write the command this way:
PARSE xyz "ABCABCABC" "" "2*C" "I"
The following two commands accomplish the same thing:
PARSE xyz "ABCD" "" ""
SET xyz "ABCD"
They are equivalent because the PARSE command means "Set the variable xyz
with everything between (and including) the first and last character".
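The rules described so far (occurrence numbers, "<" and ">", unsuccessful
searches, the Control setting, and plain and null decapsulators) can be
summarized in a short Python model. This is only a sketch of the documented
behavior, not Parse-O-Matic's actual implementation. The names
find_decapsulator and parse are hypothetical, and the handling of a "*"
inside a plain string is an assumption.

```python
def find_decapsulator(value, spec):
    """Return (index, found_text) for a decapsulator, or None if not found.

    Handles "N*text", "<*text", ">*text" and a plain "text".
    """
    count, star, rest = spec.partition("*")
    if star and count == ">":
        index = value.rfind(rest)          # last occurrence
        return (index, rest) if index != -1 else None
    if star and (count == "<" or count.isdigit()):
        n = 1 if count == "<" else int(count)
        sought = rest
    else:
        n, sought = 1, spec                # plain string: first occurrence
    if not 1 <= n <= 255:
        raise ValueError("occurrence number must be between 1 and 255")
    index, position = -1, 0
    for _ in range(n):
        index = value.find(sought, position)
        if index == -1:
            return None
        position = index + len(sought)
    return index, sought


def parse(value, from_spec, to_spec, control="X"):
    """Model of PARSE: return the text delimited by two decapsulators."""
    include = control.upper() == "I"
    start, end = 0, len(value)   # null decapsulators: first and last character
    if from_spec:
        found = find_decapsulator(value, from_spec)
        if found is None:
            return ""            # unsuccessful search gives a null
        index, text = found
        start = index if include else index + len(text)
    if to_spec:
        found = find_decapsulator(value, to_spec)
        if found is None:
            return ""
        index, text = found
        end = index + len(text) if include else index
    if start > end:
        raise ValueError('"From" text was found after the "To" text')
    return value[start:end]


print(parse("AaaBAbbBAccB", ">*A", ">*B"))      # last A to last B
print(parse("aXcaYcaZc", "2*a", "2*c", "I"))    # delimiters included
print(parse("ABCABCABC", "", "2*C"))            # null "From" decapsulator
```

Note that the model counts every occurrence from the start of the value,
which is what the examples in this section imply.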
Null Decapsulators Versus Exclusion
-----------------------------------
The reason that PARSE treats the null ("") decapsulator differently may
not be immediately obvious, since the examples given here are very simple,
and not representative of "real world" applications. However, in day-to-day
usage, you will frequently find it helpful to be able to specify a command
that says, "Give me everything from the beginning of the line to just
before such-and-such".
Here is a command that means "Give me everything from just after the dollar
sign, to the end of the line":
PARSE xyz "I'd like to have $250.00" "1*$" ""
This sets xyz to "250.00". If you want to include the dollar sign, write
the command this way:
PARSE xyz "I'd like to have $250.00" "1*$" "" "I"
Overlapping Decapsulators
-------------------------
Earlier, it was mentioned that the text found by the null decapsulator is
"always included" and is not affected by the "X" (Exclude) control. There
is one exception to this: if the null decapsulator's "found text" is
contained in the text found by the other decapsulator, it WILL be affected.
For example:
PARSE x "ABCDEFABCDEF" "" "1*AB" "X"
This command tells Parse-O-Matic "give me everything between the first
character and the first occurrence of AB". Since the two items overlap
(i.e. the first "AB" includes the first character), the first character
does indeed get excluded. As a result, the x variable is set to an empty
string ("").
Here is another example:
PARSE x "ABCDEFABCDEF" ">*F" "" "X"
This command tells Parse-O-Matic "give me everything between the last
occurrence of F and the last character". Both decapsulators refer to the
same character (i.e. the final "F"), so it is excluded. As a result, the x
variable is set to an empty string ("").
NOTE: In some circumstances, the FINDPOSN command is NOT affected by this
exception. It will do its best to make sense of your request if the
decapsulators overlap, and one of them is a null decapsulator. For
details, see "The FindPosn Command".
Parsing Empty Fields
--------------------
Consider the following command:
PARSE x ",,,JOHN,SMITH" "2*," "3*,"
There is nothing between the second and third comma, so the x variable
is set to "" (an empty string).
Now consider this command:
PARSE x ",,,JOHN,SMITH" "" ","
You are asking for everything from the first character to the first
comma (which also happens to be the first character). Obviously, there is
nothing "between" the two characters, so the x variable would be set to ""
(an empty string).
Additional Examples
-------------------
For more examples of the PARSE command, see the demonstrations provided
with Parse-O-Matic (type START at the DOS prompt, or run START.BAT from
Windows or OS/2, then select TUTORIAL).
----------------
The PEEL Command
----------------
FORMAT:     PEEL var1     var2   value1 value2 [value3]
                 :        :      :      :       :
MEANING:         Variable Source From   To      Control
PURPOSE: The PEEL command works just like PARSE, but after setting
var1, it REMOVES the parsed value (including the delimiters)
from var2.
PARAMETERS: var1 is the variable being set
var2 is the source text being read
value1 specifies the starting position
value2 specifies the ending position
value3 is the optional control setting
DEFAULTS: value3 = "X"
When you are breaking up a complex line into fields, PEEL can simplify
matters considerably, because the line being interpreted gradually becomes
less complex.
Here is a simple example. Let's say you have an input file containing a
single line:
AA/BB/CC/DD
If you run this POM file against the input file:
PEEL x $FLINE "" "/" <-- Strip out the AA and remove the /
OUTEND |{x}
PEEL x $FLINE "" "/" <-- Strip out the BB and remove the /
OUTEND |{x}
PEEL x $FLINE "" "/" <-- Strip out the CC and remove the /
OUTEND |{x}
OUTEND |{$FLINE}
then the output file will look like this:
AA
BB
CC
DD
What is happening is that $FLINE is gradually being stripped of the text
that is being found. After the first PEEL, $FLINE contains "BB/CC/DD",
and so on. After the final PEEL, $FLINE only contains "DD".
The Control Setting
-------------------
The "I" and "X" control parameters behave the same way as they do in the
PARSE command: they specify whether or not the surrounding text is included
in var1. Take note, however, that the starting and ending characters
are always removed from var2, along with the "found" text, regardless of
the control parameter. In other words, the control parameter only affects
the first variable (x in the example above), not the second ($FLINE in the
example).
Parsing Empty Fields
--------------------
Consider the following commands:
SET z = ",,,JOHN,SMITH"
PEEL x z "2*," "3*,"
There is nothing between the second and third comma, so the x variable
is set to "" (an empty string). After the PEEL command, the z variable
will be two commas shorter (",JOHN,SMITH"). If you are trying to
extract data from a comma-delimited line, this is probably not what you
want (since it gets rid of two commas). When taking apart a delimited
file, it often makes sense to start peeling from the left side of the
string. Consider these commands:
SET z = ",,,JOHN,SMITH"
PEEL x z "" ","
You are asking for everything from the first character to the first
comma (which also happens to be the first character). Obviously, there is
nothing "between" the two characters, so the x variable would be set to ""
(an empty string). After the PEEL command, the z variable will be one
comma shorter (",,JOHN,SMITH").
The Left-Peeling Method
-----------------------
You can use the "left-peeling" method to take apart an entire line. This
is especially useful when interpreting a comma-delimited file.
SET z = ",,MARY,JONES,"
PEEL a z "" "," <-- Sets the a variable to ""
PEEL b z "" "," <-- Sets the b variable to ""
PEEL c z "" "," <-- Sets the c variable to "MARY"
PEEL d z "" "," <-- Sets the d variable to "JONES"
SET e = z <-- Sets the e variable to ""
The e variable is null because there is nothing after the last comma -- in
other words, the final field is empty. If the initial value of the z
variable was ",,MARY,JONES,99" then the e variable would be set to "99".
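For comparison, the left-peeling method can be sketched in Python. This is
an illustration of the idea rather than POM code. The function name
peel_left is hypothetical, and leaving the source untouched when the
delimiter is missing is an assumption, since this section does not say what
PEEL does to var2 in that case.

```python
def peel_left(source, delimiter=","):
    """Return (field, remainder).

    field is the text before the first delimiter; remainder is the source
    with that text and the delimiter removed.
    """
    field, found, remainder = source.partition(delimiter)
    if not found:
        return "", source  # assumption: an unsuccessful search changes nothing
    return field, remainder


z = ",,MARY,JONES,"
fields = []
for _ in range(4):          # four PEEL commands, one per leading field
    field, z = peel_left(z)
    fields.append(field)
fields.append(z)            # whatever is left is the final field
print(fields)
```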
===========================================================================
POSITIONAL COMMANDS
===========================================================================
------------------
General Discussion
------------------
NOTE: If you are a programmer, you may be tempted to use positional
commands even when other Parse-O-Matic commands are more efficient.
The positional approach is reminiscent of the parsing strategies
used in traditional programming languages, so you may use them
because of their familiarity. The following material discusses
this issue, to help you to create shorter, faster POM files.
What are Positional Commands?
-----------------------------
Parse-O-Matic's positional commands let you work with the numeric position
of one text string in another. For example, if the variable xyz contains
the value "ABCD":
SEARCH  POSITION
STRING  IN xyz    COMMENTS
------  --------  -----------------------------------------
"A"     "1"       "A" appears in the 1st position of "ABCD"
"AB"    "1"
"ABCD"  "1"
"C"     "3"       "C" appears in the 3rd position of "ABCD"
"CD"    "3"
"D"     "4"
"AC"    "0"       "0" since "AC" does not appear in "ABCD"
Why Use Positional Commands?
----------------------------
Positional commands give you the precise control you need for certain
difficult parsing tasks. For example, if you want to obtain the last three
characters of a string of known length (e.g. "ABCDEFG"), the standard
approach is:
SET abc = "ABCDEFG"
SET xyz = abc[5 7]
However, if the length of the string is not known, you can not use the
substrings in [square brackets]. (To make Parse-O-Matic run as fast as
possible for standard parsing jobs, you can not use variables within
square brackets.)
If the length of the string is not known, you need positional commands to
obtain the last three characters. Here is an example:
SET abc = "Unknown"
SETLEN len abc
CALC lenminus = len "-" "2"
COPY xyz abc lenminus len
The SETLEN command finds the length (i.e. the last position) of the abc
variable. In this case, the answer is "7", since "Unknown" is seven
characters long. The CALC command subtracts "2" from this length, setting
the lenminus variable to "5". Finally, the COPY command copies from
position "5" to "7", setting the variable xyz to "own" -- the last three
characters of the abc variable.
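The same arithmetic can be written as a Python sketch, which may help if
you are used to zero-based languages. The function name copy_range is
hypothetical; it mimics COPY's 1-based, inclusive positions. The max() guard
for strings shorter than three characters is an addition for this sketch and
is not part of the POM example.

```python
def copy_range(value, start, end):
    """Copy characters start..end, counting from 1 and including both ends."""
    return value[start - 1:end]


abc = "Unknown"
length = len(abc)                  # like SETLEN: 7
lenminus = max(length - 2, 1)      # like CALC: 5 (never below position 1)
xyz = copy_range(abc, lenminus, length)
print(xyz)
```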
A Cautionary Note
-----------------
Positional commands are useful for some applications, but many parsing jobs
do not require them. The commands SET, IF, PARSE and PEEL can usually do
the same job with less effort. For example, the following approaches are
equivalent:
STANDARD APPROACH      POSITIONAL APPROACH
-----------------      -------------------
SET abc "AB/CD"        SET abc = "AB/CD"
PARSE xyz abc "/"      FINDPOSN n abc "/"
                       COPY xyz abc n+
The positional approach requires more lines than the standard approach to
extracting the characters after the "/" character. Another problem is that
because positional commands give you fine control of the parsing process,
it is up to you to guard against exceptional situations. Consider this
example:
FINDPOSN x $FLINE "/"
CALC x = x "+" "1"
COPY xyz $FLINE x
If $FLINE (the current input line) contains the value "ABC/DEF":
FINDPOSN sets x to "4" (the position of the "/" character)
CALC increases x to "5"
COPY sets xyz to "DEF" -- from position "5" to the end of $FLINE
Unfortunately, a problem occurs if $FLINE does not contain a slash:
FINDPOSN sets x to "0" (meaning the "/" was not found)
CALC increases x to "1"
COPY copies from position "1" to the end of $FLINE
This may not be what you intended. If you want to return a null string
when $FLINE does not contain a slash, you could use a single PARSE command:
PARSE xyz $FLINE "/"
This copies anything after the slash to the xyz variable. If $FLINE does
not contain a slash, xyz is set to "".
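The pitfall is easy to reproduce in any language that reports "not found"
as a number. The Python sketch below is an illustration only: find_posn and
after_slash are hypothetical names, and find_posn imitates the "0 means not
found" convention described above.

```python
def find_posn(value, sought):
    """Return the 1-based position of sought in value, or 0 if absent."""
    return value.find(sought) + 1


def after_slash(line):
    """Return the text after the first "/", or "" when there is no slash."""
    posn = find_posn(line, "/")
    if posn == 0:
        return ""       # guard: without it, the whole line would be copied
    return line[posn:]  # 1-based posn + 1, written as a 0-based slice


print(after_slash("ABC/DEF"))
print(repr(after_slash("ABCDEF")))
```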
The precise control provided by Parse-O-Matic's positional commands makes
them indispensable for certain parsing applications. Just remember that
with added power comes added responsibility: you will sometimes have to
add extra code to handle unusual situations.
------------------
The SETLEN Command
------------------
FORMAT: SETLEN var1 value1
PURPOSE: SETLEN sets var1 to the length of value1.
Here is an example of the SETLEN command:
SET x = "ABCD"
SETLEN y x
This sets variable y to "4".
One handy application for SETLEN is to underline text. For example:
SET name = $FLINE[1 15]
TRIM name "B" " "
SETLEN nlen name
SET uline = ""
PAD uline "L" "-" nlen
OUTEND |{name}
OUTEND |{uline}
If the input line contains the name "JOHN SMITH", the output would be:
JOHN SMITH
----------
For another example that does underlining, see "POM and Wildcards".
------------------
The DELETE Command
------------------
FORMAT: DELETE var1 value1 [value2]
PURPOSE: The DELETE command removes a range of characters (specified
as a starting and ending position) from a variable.
PARAMETERS: var1 is the variable from which characters will be removed
value1 is the starting position (e.g. "1" = First character)
value2 is the optional ending position; if it is omitted,
it is assumed to mean "the last character in var1"
NOTES: If value1 is null or "0", value1 = "1"
If value2 is null or "0", value2 = "last character in var1"
ALTERNATIVES: The PEEL, TRIM, CHANGE, SET and APPEND commands.
Here is an example of the DELETE command:
SET x = "ABC///DEF"
DELETE x "4" "6"
This deletes from position 4 to 6, so the variable x is set to "ABCDEF".
If value2 is omitted, DELETE assumes you wish to delete everything from
the starting position to the end of the string. For example:
SET x = "ABC///DEF"
DELETE x "4"
This sets x to "ABC".
----------------
The COPY Command
----------------
FORMAT: COPY var1 value1 value2 [value3]
PURPOSE: The COPY command copies a range of characters (specified as
a starting and ending position) from a value to a variable.
PARAMETERS: var1 is the variable being set
value1 is the source value, from which you will copy text
value2 is the starting position (e.g. "1" = First character)
value3 is the optional ending position; if it is omitted,
it is assumed to mean "the last character in value1"
NUMERICS: Tabs, spaces and commas are stripped from value2 and value3
NOTES: If value2 is null or "0", value2 = "1"
If value3 is null or "0", value3 = "last char in value1"
ALTERNATIVES: The SET command.
Here is an example of the COPY command:
SET x = "ABC///DEF"
COPY y x "4" "6"
This copies from position 4 to 6, so the variable y is set to "///".
If value3 is omitted, COPY assumes you wish to copy everything from the
starting position to the end of the string. For example:
SET x = "ABC///DEF"
COPY y x "4"
This sets y to "///DEF".
To make your POM files easier to read, you might consider padding the
COPY command with an equals sign to remind you that a variable is being
set. For example:
COPY y = x "4" "6"
This emphasizes that the variable y is being set to a substring of x.
For more information about padding, see "Padding for Clarity".
-------------------
The EXTRACT Command
-------------------
FORMAT: EXTRACT var1 var2 value1 [value2]
PURPOSE: The EXTRACT command works like COPY, but removes the
characters from the source variable after copying them to
a variable.
PARAMETERS: var1 is the variable that will contain the characters
extracted from var2
var2 is the variable from which characters will be copied
to var1, then removed
value1 is the starting position (e.g. "1" = First character)
value2 is the optional ending position; if it is omitted,
it is assumed to mean "the last character in var2"
NUMERICS: Tabs, spaces and commas are stripped from value1 and value2
NOTES: If value1 is null or "0", value1 = "1"
If value2 is null or "0", value2 = "last character in var2"
ALTERNATIVES: The PEEL command.
Here is an example of the EXTRACT command:
SET x = "ABC///DEF"
EXTRACT y x "4" "6"
This copies from position 4 to 6, so the variable y is set to "///".
The characters copied to variable y are removed from x, so that it now
contains the value "ABCDEF".
If value2 is omitted, EXTRACT assumes you wish to extract everything from
the starting position to the end of the string. For example:
SET x = "ABC///DEF"
EXTRACT y x "4"
This sets y to "///DEF", while the variable x is set to "ABC" (i.e. the
original value for x, with the extracted characters removed).
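DELETE, COPY and EXTRACT all share the same position rules, which the
following Python sketch tries to capture. It is a model for illustration,
not POM code. The function names are hypothetical, and the functions return
new strings instead of changing a variable in place, so extract returns
both results as a pair.

```python
def resolve_range(value, start=0, end=0):
    """Turn 1-based inclusive positions into a 0-based slice.

    A start of 0 means the first character; an end of 0 means the last.
    """
    first = start if start > 0 else 1
    last = end if end > 0 else len(value)
    return first - 1, last


def delete(value, start=0, end=0):
    """Return value with the characters start..end removed."""
    lo, hi = resolve_range(value, start, end)
    return value[:lo] + value[hi:]


def copy(value, start=0, end=0):
    """Return the characters start..end of value."""
    lo, hi = resolve_range(value, start, end)
    return value[lo:hi]


def extract(value, start=0, end=0):
    """Return (extracted text, what remains of the source)."""
    return copy(value, start, end), delete(value, start, end)


x = "ABC///DEF"
print(delete(x, 4, 6))
print(copy(x, 4))
print(extract(x, 4, 6))
```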
--------------------
The FINDPOSN Command
--------------------
FORMAT:     FINDPOSN var1     value1 value2 [value3 [value4]]
                     :        :      :       :       :
MEANING: 1)          Variable Source Find    :       :
         2)          Variable Source From    To      Control
PURPOSE: The FINDPOSN command finds one text string in another. It
locates the starting or ending position of a string, or
a string delimited by one or two other strings.
PARAMETERS: var1 is the variable that will contain the position if
the string is found (e.g. "2" means it was found
in the second position of value1; "0" means the
string was not found)
value1 is the string being searched
value2 is the string being sought, or...
the left-most part of a string being sought
value3 is the right-most part of the string being sought;
if it is set to null (""), it is assumed to mean
"the last character in value1"
value4 is the control setting
DEFAULTS: value4 = "IS"
SEE ALSO: This section is much easier to understand if you have
studied "The Parse Command".
There are two ways to use the FINDPOSN command: the "Plain String Find"
and the "Encapsulated String Find". These are discussed below.
and the "Embedded String Find". These are discussed below.
The Plain String Find
---------------------
In its simplest form, the Plain String Find locates a string (value2) in
another string (value1) and assigns its position to a variable (var1).
Here is an example:
FINDPOSN x $FLINE "Fred"
This looks for the first occurrence of "Fred" in $FLINE (the current input
line). If $FLINE contains "Hello Fred!", the command will set the variable
x to "7", since "Fred" starts in the seventh character position.
Using a Single Decapsulator
---------------------------
Sometimes you don't want to find the FIRST occurrence, but the second,
third, and so on. You can use a single decapsulator (see "The Parse
Command") to specify this. For example:
SET z = "This is the way to demonstrate the FINDPOSN command"
FINDPOSN x z "the"
FINDPOSN y z "2*the"
The first FINDPOSN command finds the first occurrence of "the", using a
plain string, so it sets the variable x to "9", since the first "the"
starts in the ninth position.
The second FINDPOSN command uses a decapsulator with the occurrence number
"2*", which means "look for the second occurrence". Thus, it sets the
variable y to "32", since the second "the" occurs in that position.
Incidentally, the first FINDPOSN could also have been written this way:
FINDPOSN x z "1*the"
which is another way of saying, "Look for the first occurrence". However,
if no occurrence number is specified, FINDPOSN assumes you are looking for
the first occurrence.
The Encapsulated String Find
----------------------------
NOTE: The Encapsulated String Find is very similar to the PARSE command.
If you do not find the following discussion sufficiently
instructive, you can gain some additional insight by reading the
section of this manual entitled "The Parse Command".
The Encapsulated String Find looks for a string that is encapsulated by
(i.e. located between) two other strings. This is useful if your input
data contains text that is surrounded by delimiters. One common example is
the "comma-delimited" file (see "Why You Need Parse-O-Matic -- An Example"
for a sample). Here is another situation where data is surrounded by
delimiters:
|Mouse |Gazelle|Mouse  |Elephant|
|Dog   |Giraffe|Elk    |Mongoose|
|Monkey|Snake  |Caribou|Trout   |
One can imagine an application that would create tabular data like this --
cleverly (but annoyingly) reducing the column widths to the minimum. This
would make the column starting and ending positions unpredictable.
You could use the PARSE command to obtain values from each column, but if
you have a lot of data, it would be more efficient to determine the
starting and ending positions at the outset.
Let's say you wanted to extract the third column. You could set up your
POM file like this:
BEGIN startposn = ""
FINDPOSN startposn $FLINE "3*|" "4*|" "XS"
FINDPOSN endposn $FLINE "3*|" "4*|" "XE"
HALT startposn = "0" "Missing delimiter!"
END
COPY animal $FLINE startposn endposn
OUTEND |{animal}
The lines between the BEGIN and END are run only once for the entire
parsing job, since they set the startposn variable to something other than
a null ("") string. (See "Uninitialized and Persistent Variables")
The first FINDPOSN command uses the decapsulators "3*|" and "4*|" to locate
the text between the third and fourth "|" delimiters, but because of the
"XS" control value (described later), startposn is set to the position
AFTER the delimiter. (Briefly, "XS" means "exclude the found text, and
refer to the starting position of the text that follows it.) Thus, the
variable startposn is set to "17"; "Mouse" starts in the seventeenth position.
The second FINDPOSN command sets the ending position (endposn) in a similar
way. It finds the third and fourth "|" delimiters, but because of the "XE"
control setting, it sets endposn to the position BEFORE the fourth
delimiter. (Briefly, "XE" means "exclude the found text, and refer to the
ending position of the text that precedes it.)
The HALT command is simply a safeguard to ensure that the input data
follows the correct format. If the first FINDPOSN fails to find the
third or fourth "|" delimiter, it will set startposn to "0" (meaning "not
found").
The COPY command copies $FLINE (the current input line) from the starting
position (startposn) to the ending position (endposn). This value is then
output by the OUTEND command.
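The "find the column once, then reuse the positions" strategy can be
sketched in Python as follows. This is an illustration only. The name
nth_posn is hypothetical, the rows are assumed to be padded with spaces so
that every "|" lines up, and the trailing-space trimming is an addition for
readability.

```python
def nth_posn(line, sought, n):
    """Return the 1-based position of the n-th occurrence of sought, or 0."""
    index, position = -1, 0
    for _ in range(n):
        index = line.find(sought, position)
        if index == -1:
            return 0
        position = index + len(sought)
    return index + 1


rows = [
    "|Mouse |Gazelle|Mouse  |Elephant|",
    "|Dog   |Giraffe|Elk    |Mongoose|",
    "|Monkey|Snake  |Caribou|Trout   |",
]

# Work out the column boundaries once, from the first row only.
third_bar = nth_posn(rows[0], "|", 3)
fourth_bar = nth_posn(rows[0], "|", 4)
if third_bar == 0 or fourth_bar == 0:
    raise SystemExit("Missing delimiter!")
startposn = third_bar + 1   # like "XS": first character after the 3rd "|"
endposn = fourth_bar - 1    # like "XE": last character before the 4th "|"

# Reuse the same positions for every row.
animals = [row[startposn - 1:endposn].rstrip() for row in rows]
print(animals)
```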
Control Settings
----------------
The control settings give you precise control of the part of the string
to which you are referring. Valid control settings are:
SETTING MEANING
------- -------
IS Include found text and report where the entire text starts
IE Include found text and report where the entire text ends
XS Exclude found text and report where the delimited text starts
XE Exclude found text and report where the delimited text ends
NOTE: While FINDPOSN greatly resembles the PARSE command, the default
control setting is different. In PARSE, the control setting is
assumed to be "X" if it is omitted. In FINDPOSN, however, the
control setting is assumed to be "IS" if it is omitted.
Let us assume that we set the variable z as follows:
SET z = "ABzzzCDEFzzzGH"
This produces the following results:
COMMAND VALUE FOR x VARIABLE
--------------------------------- --------------------
FINDPOSN x z "1*zzz" "2*zzz" "IS" "3"
FINDPOSN x z "1*zzz" "2*zzz" "XS" "6"
FINDPOSN x z "1*zzz" "2*zzz" "XE" "9"
FINDPOSN x z "1*zzz" "2*zzz" "IE" "12"
The following illustration may make the results easier to understand:
+------------------------------------------------------------------------+
|                                                                        |
|  Measuring Scale:          12345678901234                              |
|                            --------------                              |
|  Command:      FINDPOSN x "ABzzzCDEFzzzGH" "zzz" "2*zzz" "<control>"   |
|                              |  |  |  |                                |
|  Control Value:             IS XS XE IE                                |
|  Results:                    3  6  9 12                                |
|                                                                        |
+------------------------------------------------------------------------+
In the example, the control values have the following specific meanings:
"IS" ("Include, Start") = start of entire text (from "1*zzz" to "2*zzz")
"XS" ("Exclude, Start") = start of text after the "from" item ("1*zzz")
"XE" ("Exclude, End") = end of text before the "to" item ("2*zzz")
"IE" ("Include, End") = end of entire text (from "1*zzz" to "2*zzz")
Insoluble Searches
------------------
FINDPOSN returns "0" (zero) when it can not find a string, or if it is
presented with an insoluble dilemma. Here are some examples:
FINDPOSN x "CatDog" "Moose" <-- "Moose" can not be found
FINDPOSN x "ABCDEF" "A" "G" <-- "G" can not be found
FINDPOSN x "ABCDEF" "A" "2*E" <-- There is no second "E"
Here is another insoluble search:
FINDPOSN x "ABCDEF" "C" "D" "XS"
FINDPOSN x "ABCDEF" "C" "D" "XE"
There is nothing between the "from" and "to" delimiters. Since we are
excluding the delimiters themselves (with "XS" and "XE" specifications), we
can not provide a "start" or "end" value for what we found -- we didn't
find anything! Hence, we have nothing for which to return a starting or
ending position.
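The four control settings, and the insoluble cases just described, can be
modeled with a short Python sketch. It is not FINDPOSN's real
implementation. The names nth_index and find_posn are hypothetical,
occurrence numbers are passed as separate integers for simplicity, and null
decapsulators are not modeled.

```python
def nth_index(value, sought, n):
    """Return the 0-based index of the n-th occurrence of sought, or -1."""
    index, position = -1, 0
    for _ in range(n):
        index = value.find(sought, position)
        if index == -1:
            return -1
        position = index + len(sought)
    return index


def find_posn(value, frm, frm_n, to, to_n, control="IS"):
    """Return a 1-based position for the text delimited by frm and to."""
    from_index = nth_index(value, frm, frm_n)
    to_index = nth_index(value, to, to_n)
    if from_index == -1 or to_index == -1:
        return 0                          # a delimiter was not found
    if control == "IS":
        return from_index + 1             # where the entire text starts
    if control == "IE":
        return to_index + len(to)         # where the entire text ends
    inner_start = from_index + len(frm)   # 0-based index just after "from"
    inner_end = to_index                  # 0-based index of "to"
    if inner_start >= inner_end:
        return 0                          # nothing lies between the delimiters
    if control == "XS":
        return inner_start + 1            # where the delimited text starts
    if control == "XE":
        return inner_end                  # where the delimited text ends
    raise ValueError("control must be IS, IE, XS or XE")


z = "ABzzzCDEFzzzGH"
results = [find_posn(z, "zzz", 1, "zzz", 2, c) for c in ("IS", "XS", "XE", "IE")]
print(results)
```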
Null Decapsulators
------------------
Consider these next two commands:
FINDPOSN x "ABCDEF" "F" "" "XS"
FINDPOSN x "ABCDEF" "F" "" "XE"
Nothing comes between "F" and the end of the string. Bear in mind, however,
that when you use a null ("") to mean "the last character", it is not
excluded (see "The Null Decapsulator" in the section entitled "The Parse
Command", for a discussion). Thus, the two FINDPOSN commands "find" the
final character "F", and both return "6".
These both return "6" because the "F" is both the starting and ending
position of what we found, and we included (rather than excluded) the
starting and ending delimiters ("F" and the last character, respectively).
Similarly, the following commands return a "1":
FINDPOSN x "ABCDEF" "" "A" "XS"
FINDPOSN x "ABCDEF" "" "A" "XE"
Even though there is nothing between "A" and "the first character", the
first character is not excluded, since we are using a null decapsulator.
As a result, we find the string "A" and return its position, which is "1".
Finding The Last Word
---------------------
One common use for FINDPOSN is to find the last occurrence of a word in a
line of text. Consider the following lines:
SET z = "Parse-O-Matic is a fine product!"
FINDPOSN x z ">* " "" "XS"
This will set the x variable to 25 (the position of the final word). The
command looks for the last "space" character (which is in position 24),
then (because of the "XS" control) returns the position of the character
following it.
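The same position arithmetic, written as a Python sketch for comparison
(not POM code). The variable names are chosen for illustration. If the line
contains no space at all, this sketch treats the whole line as the last
word, which is a choice made here and not something documented for
FINDPOSN.

```python
z = "Parse-O-Matic is a fine product!"

last_space = z.rfind(" ") + 1   # 1-based position of the last space
x = last_space + 1              # position of the character that follows it
last_word = z[x - 1:]           # from position x to the end of the line

print(x, last_word)
```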
Who Needs This?
---------------
At this point, you may be wondering, "Why do I need to have this kind of
precise control?" Well, in most cases you don't, so you will tend to use
the "Plain String Find" (described earlier). However, certain complex
parsing applications demand that you make a distinction between the text
that encapsulates a piece of text, and the encapsulated text itself. When
faced with this kind of task, you will see that Parse-O-Matic's FINDPOSN
command lets you accomplish in one line what would take dozens of lines in
a traditional programming language.
===========================================================================
DATE COMMANDS
===========================================================================
------------------
General Discussion
------------------
Parse-O-Matic's date-oriented commands provide you with a convenient way to
work with dates. While you can accomplish the same thing using other
Parse-O-Matic commands (LOOKUP, PAD etc.), the date functions are optimized
for speed, so if your parsing job does a lot of date-format conversions, it
will run faster.
The POMDATE.CFG File
--------------------
When a date command is first executed, Parse-O-Matic reads in a file named
POMDATE.CFG. (The method by which Parse-O-Matic finds the file is
discussed in the section "How Parse-O-Matic Searches for a File".)
POMDATE.CFG is a self-documenting text file that contains the default
date format string (explained later), and the names of the twelve months.
You can edit this file with a standard text editor, or a word-processor in
"generic text" mode.
As originally supplied with Parse-O-Matic, the default date format string
is "?y/?n/?d", which produces YY/MM/DD dates (e.g. July 1 1998 becomes
98/07/01). You can change this to reflect your own preference.
If you are parsing data in a language other than English, you can also
change the names of the months.
Date Formats
------------
A date format is a sequence of characters that briefly describes the
appearance of a date. For example, the format "Y-T-?d" describes a
year/month/day format that looks like this: 1996-JULY-02
The following characters have a special meaning in the date format
string: d M m n T t Y y ?
For these special characters, uppercase and lowercase are important.
For example, "T" is not the same as "t".
All characters other than the special characters are interpreted "as-is",
and are included in the final date string.
The following table explains the meaning of the special characters used
to specify year, month and day, using the date July 2, 1998 for the
examples:
CHAR  MEANING                      SAMPLE FORMAT  SAMPLE RESULT
----  ---------------------------  -------------  -------------
Y     4-digit year                 d-m-Y          2-Jul-1998
y     1- or 2-digit year           d-m-y          2-Jul-98
n     1- or 2-digit month          d/n/y          2/7/98
m     3-letter month               d/m/y          2/Jul/98
M     3-letter month (uppercase)   d M y          2 JUL 98
t     Month                        t d, Y         July 2, 1998
T     Month (uppercase)            T d Y          JULY 2 1998
d     Day                          y/n/d          98/7/2
The ? character can be used in the date format to pad out one-digit
values to two digits. The following table uses the date February 3, 2001
for the examples:
SAMPLE DATE FORMAT  SAMPLE RESULT
------------------  -------------
y-?n-?d             1-02-03
?y/m/?d             01/Feb/03
?n/?d Y             02/03 2001
t '?y               February '01
As the last example shows, it is not necessary to use month, day and year;
you can omit any item to obtain an abbreviated date.
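To make the format rules concrete, here is a small Python model of them.
It is a sketch under stated assumptions, not Parse-O-Matic's date code: the
name format_date is hypothetical, the month names are hard-coded in English
instead of being read from POMDATE.CFG, and a "?" that is not followed by
y, n or d is simply dropped.

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def format_date(fmt, year, month, day):
    """Build a date string from a format such as "?y/?n/?d"."""
    name = MONTHS[month - 1]
    values = {
        "Y": "%04d" % year,       # 4-digit year
        "y": str(year % 100),     # 1- or 2-digit year
        "n": str(month),          # 1- or 2-digit month
        "d": str(day),            # day
        "m": name[:3],            # 3-letter month
        "M": name[:3].upper(),    # 3-letter month (uppercase)
        "t": name,                # month
        "T": name.upper(),        # month (uppercase)
    }
    out = []
    pad = False
    for ch in fmt:
        if ch == "?":
            pad = True            # pad the next numeric item to two digits
            continue
        text = values.get(ch, ch) # other characters are copied as-is
        if pad and ch in "ynd":
            text = text.zfill(2)
        pad = False
        out.append(text)
    return "".join(out)


print(format_date("t d, Y", 1998, 7, 2))
print(format_date("?y/m/?d", 2001, 2, 3))
```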
-----------------
The TODAY Command
-----------------
FORMAT: TODAY var1 [value1]
PURPOSE: The TODAY command sets a variable (var1) to today's date, in
a variety of formats.
DEFAULTS: If value1 is not specified, TODAY uses the default date
format, which is specified in the file POMDATE.CFG.
NOTES: For a discussion of date formats (including the default date
format), see the "General Discussion" section at the
beginning of this chapter.
SEE ALSO: "The Date Command"
Assuming today's date is July 1 1998, here are some examples:
COMMAND THE VARIABLE xyz IS SET TO...
------------------- -----------------------------
TODAY xyz The default date format
TODAY xyz "" The default date format
TODAY xyz "Y-M-?d" 1998-JUL-01
TODAY xyz "t d Y" July 1 1998
TODAY xyz "t 'y" July '98
As the last example shows, it is not necessary to use month, day and year;
you can omit any item to obtain an abbreviated date.
----------------
The DATE Command
----------------
FORMAT: DATE var1 value1 value2 value3 [value4]
PURPOSE: The DATE command sets a variable (var1) to a given year
(value1), month (value2) and day (value3), or a subset of
these items, in a variety of formats, as specified by the
format string (value4).
PARAMETERS: var1 is the variable being set
value1 is the year (e.g. "1998" or "98")
value2 is the month (e.g. "1" = January)
value3 is the day (1 to 31)
value4 is the date format
NUMERICS: Tabs, spaces and commas are stripped from value1, 2 and 3
DEFAULTS: If value4 is omitted, DATE uses the default date format,
which is specified in the file POMDATE.CFG.
NOTES: For a discussion of date formats (including the default date
format), see the "General Discussion" section at the
beginning of this chapter.
SEE ALSO: "The Today Command"
Assuming the date being set is July 1 1998, here are some examples:
COMMAND                             THE VARIABLE xyz IS SET TO...
----------------------------------  -----------------------------
DATE xyz "98" "07" "01"             The default date format
DATE xyz "1998" "07" "01" ""        The default date format
DATE xyz "98" "7" "1" "Y-M-?d"      1998-JUL-01
DATE xyz "98" "07" "01" "t d Y"     July 1 1998
DATE xyz "98" "7" "01" "t 'y"       July '98
DATE xyz "98" "7" "" "t 'y"         July '98
As the last two examples show, it is not necessary to use month, day and
year; you can omit any item to obtain an abbreviated date.
If a date is outside a valid range, Parse-O-Matic halts with an error.
Acceptable value ranges are: Year 0 to 9999; Month 1 to 12; Day 1 to 31
If the year is between 0 and 99, Parse-O-Matic makes the following
assumptions:
- If the number is between 80 and 99, it means 1980 to 1999
- If the number is between 0 and 79, it means 2000 to 2079
Parse-O-Matic does not check that a date is "possible", so you could set
a date to "February 31, 2001", even though February never has 31 days.
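The range checks and the two-digit year rule can be sketched in Python as
shown here. This is an illustration only: the function name
check_date_parts is hypothetical, the values are taken as integers (the
DATE command itself accepts text and lets you omit items), and the sketch
assumes that two-digit years from 80 to 99 fall in the 1900s while all
lower values fall in the 2000s.

```python
def check_date_parts(year, month, day):
    """Range-check DATE-style values and expand a short year (sketch)."""
    if not 0 <= year <= 9999:
        raise ValueError("Year must be 0 to 9999")
    if not 1 <= month <= 12:
        raise ValueError("Month must be 1 to 12")
    if not 1 <= day <= 31:
        raise ValueError("Day must be 1 to 31")
    if year <= 99:
        # Assumption: 80-99 means the 1900s, anything lower the 2000s
        year += 1900 if year >= 80 else 2000
    return year, month, day

print(check_date_parts(98, 7, 1))    # (1998, 7, 1)
print(check_date_parts(1, 2, 31))    # (2001, 2, 31)
```

As in Parse-O-Matic, the second call is accepted even though February 31
is not a possible date; only the individual ranges are checked.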
--------------------
The MONTHNUM Command
--------------------
FORMAT: MONTHNUM var1 value1
PURPOSE: The MONTHNUM command sets var1 to the number of the month
named in value1.
ALTERNATIVES: The LOOKUP command.
Here is an example of the MONTHNUM command:
MONTHNUM xyz "February"
This will set the variable xyz to "2".
The comparison is performed on the basis of the number of characters
available, without regard to case, so the following would also work:
MONTHNUM xyz "FEB"
If the result is ambiguous, Parse-O-Matic returns the first match. For
example:
MONTHNUM xyz "JU"
This will set xyz to "6", although it could refer to either June or July.
If MONTHNUM can not find a match, it will return a null ("") string.
For example:
MONTHNUM xyz "ZZZ"
Since no month starts with "ZZZ", this will set xyz to "".
If you are writing a Parse-O-Matic application that will be run in several
languages (using different POMDATE.CFG files), you should carefully study
the names of the months in each language to avoid problems. In English, it
is always sufficient to provide the first three letters. In French,
however, you need at least four letters, to distinguish between "Juin"
(June) and "Juillet" (July).
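The matching rule can be sketched in Python as follows. This is an
illustration, not the program's own code. The function name monthnum, the
two month lists (which stand in for the contents of a POMDATE.CFG file)
and the treatment of an empty search value (no match) are assumptions.

```python
ENGLISH = ["January", "February", "March", "April", "May", "June",
           "July", "August", "September", "October", "November",
           "December"]
FRENCH = ["Janvier", "Février", "Mars", "Avril", "Mai", "Juin",
          "Juillet", "Août", "Septembre", "Octobre", "Novembre",
          "Décembre"]

def monthnum(value, names=ENGLISH):
    """Return the number of the first month that starts with value."""
    wanted = value.lower()            # comparison ignores case
    if not wanted:
        return ""                     # assumption: empty input matches nothing
    for number, name in enumerate(names, start=1):
        if name.lower().startswith(wanted):
            return str(number)        # first match wins
    return ""                         # no match: null string

print(monthnum("FEB"))           # 2
print(monthnum("JU"))            # 6
print(monthnum("Jui", FRENCH))   # 6
print(monthnum("Juil", FRENCH))  # 7
```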
Parse-O-Matic can use only one POMDATE.CFG file at a time, so the MONTHNUM
command can not be used to translate month names from one language to
another. You can, however, accomplish the same thing with the LOOKUP
command.
--------------------
The ZERODATE Command
--------------------
** ADVANCED COMMAND FOR EXPERIENCED USERS **
FORMAT: ZERODATE value1 value2 value3
PURPOSE: Specifies "day zero" for the date serial number used by the
MAKETEXT command when it uses the DATE predefined data type.
PARAMETERS: value1 is the year (e.g. "1900")
value2 is the month (e.g. "12" for December)
value3 is the day (e.g. "5")
NUMERICS: Tabs, spaces and commas are stripped from value1, 2 and 3
DEFAULTS: If the ZERODATE command is omitted, the zero date is assumed
to be Jan. 1, 1753 (equivalent to ZERODATE "1753" "1" "1").
SEE ALSO: "The MakeText Command" and "Predefined Data Types"
A "date serial number" is a common method of representing a date in a data
file. It works by counting the number of days since a given date, taking
into account the extra days for leap years.
Leap years occur in every year that is divisible by four, with the
exception of century years -- unless they are divisible by 400. Thus, 1900
is not a leap year, but 2000 is.
The ZERODATE command specifies "Day 0". For example, if you specify
ZERODATE "1918" "11" "11" (November 11, 1918), you get the following:
DATE               DATE SERIAL NUMBER
-----------------  ------------------
November  9, 1918          -2
November 10, 1918          -1
November 11, 1918           0
November 12, 1918           1
November 13, 1918           2
... and so on. Most programs set the zero date far enough back that
negative numbers are not encountered in normal usage.
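The leap year rule and the day count can be checked with a short Python
sketch. It uses Python's standard datetime module, which follows the same
Gregorian rules. The function names is_leap_year and date_serial are
hypothetical.

```python
from datetime import date

def is_leap_year(year):
    """Divisible by 4, except century years not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def date_serial(year, month, day, zero=(1753, 1, 1)):
    """Number of days between the zero date and the given date."""
    return (date(year, month, day) - date(*zero)).days

armistice = (1918, 11, 11)                     # the zero date from the table
print(date_serial(1918, 11, 9, armistice))     # -2
print(date_serial(1918, 11, 13, armistice))    # 2
print(is_leap_year(1900), is_leap_year(2000))  # False True
```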
ZERODATE will not accept a starting year before "1753", which was the first
full year that most of the Western world started using the Gregorian
calendar.
===========================================================================
CALCULATION COMMANDS
===========================================================================
----------------
The CALC Command
----------------
FORMAT: CALC var1 value1 operation value2
PURPOSE: The CALC command performs an integer arithmetic operation on
the two values and assigns the answer to var1.
NUMERICS: Tabs, spaces and commas are stripped from value1 and value2
ALTERNATIVES: The CALCREAL command.
SEE ALSO: "Inline Incrementing and Decrementing"
Integer arithmetic refers to whole numbers. 1, 10 and 10000 are integers,
while 2.0, 3.14159 and 98.5 are not.
Let's say your input file looks like this:
DESCRIPTION      UNITS SOLD    UNIT PRICE
-----------------------------------------
Dog collar               15    $     3.00
Cat collar               25    $     2.50
Cat caller                3    $     7.25
Birdie num-nums       1,305    $     6.25
-----------------------------------------
End of Data
:                :        :     :       :
:                :        :     :       :  (Column positions)
1                18       27    33      41
You can find out the total number of units sold (of all types) with the
following POM file:
IGNORE $FLINE[1 7] = "DESCRIP"
IGNORE $FLINE[1 7] = "-------"
BEGIN $FLINE = "End of Data"
OUTEND |Total units sold = {units}
ELSE
CALC units = units "+" $FLINE[18 27]
END
As you can see from the example, all spaces and commas are stripped from
the number. Tab characters (ASCII 09) are also stripped.
You will also notice that CALC can not be used for the prices, since they
are not integer data. To add up the prices, you must use the CALCREAL
command (see "The CalcReal Command").
Note in particular that the operation ("+" in this case) is in quotes. If
you omit the quotes, Parse-O-Matic will report an error.
The following operations can be performed with CALC:
SYMBOL     DESCRIPTION
---------  --------------------------------------------
"+"        value1 plus value2
"-"        value1 minus value2
"*"        value1 times value2
"/"        value1 divided by value2 (remainder ignored)
"HIGHEST"  the larger number (value1 or value2)
"LOWEST"   the smaller number (value1 or value2)
Here are some more examples of the CALC command:
COMMAND                            ANSWER
--------------------------------   ------
CALC answer = "12" "/" "4"         "3"
CALC answer = "12" "HIGHEST" "4"   "12"
CALC answer = "12" "LOWEST" "4"    "4"
CALC answer = "12" "-" "4"         "8"
CALC answer = "12" "+" "4"         "16"
CALC answer = "12" "*" "4"         "48"
CALC can handle numbers between -2,147,483,648 and 2,147,483,647.
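The behaviour of CALC can be approximated in Python as follows. This is a
sketch, not the program's own code. The function names are hypothetical,
and two details are assumptions: an empty value counts as zero (as the
running total in the example POM file suggests), and division of negative
numbers drops the remainder by truncating toward zero.

```python
def clean_number(text):
    """Strip the tabs, spaces and commas that CALC ignores."""
    for ch in "\t ,":
        text = text.replace(ch, "")
    return text

def calc(value1, operation, value2):
    a = int(clean_number(value1) or "0")   # assumption: empty counts as 0
    b = int(clean_number(value2) or "0")
    if operation == "+":
        result = a + b
    elif operation == "-":
        result = a - b
    elif operation == "*":
        result = a * b
    elif operation == "/":
        result = abs(a) // abs(b)          # remainder ignored
        if (a < 0) != (b < 0):
            result = -result               # assumption: truncate toward zero
    elif operation == "HIGHEST":
        result = max(a, b)
    elif operation == "LOWEST":
        result = min(a, b)
    else:
        raise ValueError("Unknown operation: " + operation)
    if not -2147483648 <= result <= 2147483647:
        raise OverflowError("Result is outside the range CALC can handle")
    return str(result)

units = ""
for sold in ["15", "25", "3", "1,305"]:    # the UNITS SOLD column
    units = calc(units, "+", sold)
print("Total units sold =", units)         # Total units sold = 1348
```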
--------------------
The CALCREAL Command
--------------------
FORMAT: CALCREAL var1 value1 operation value2 [fixed-decimals]
PURPOSE: CALCREAL works the same way as CALC, except that it handles
decimal numbers.
NUMERICS: Tabs, spaces and commas are stripped from value1, value2,
and the "fixed-decimals" value
ALTERNATIVES: The CALC command.
Using the sample data given in the CALC section, you could write the
following POM file:
IGNORE $FLINE[1 7] = "DESCRIP"
IGNORE $FLINE[1 7] = "-------"
BEGIN $FLINE = "End of Data"
OUTEND |Total units sold = {units}
OUTEND |Total value sold = {value}
ELSE
CALC units = units "+" $FLINE[18 27]
CALCREAL value = value "+" $FLINE[33 41]
END
CALCREAL can handle values +/- 99,999,999,999, but its accuracy decreases
when you are dealing with large numbers, as approximated below:
Accurate to 1 decimal place between +/- 9,999,999,999
Accurate to 2 decimal places between +/- 999,999,999
Accurate to 3 decimal places between +/- 99,999,999
Accurate to 4 decimal places between +/- 9,999,999
Accurate to 5 decimal places between +/- 999,999
You can specify a fixed number of decimal positions in the answer by using
the optional "fixed-decimals" value. For example:
SET z = "3.14159"
CALCREAL x = z "+" "0" "2" <-- This sets x to "3.14"
CALCREAL x = z "+" "0" "4" <-- This sets x to "3.1415"
You will notice, in the second example, that no "rounding" takes place.
The number is simply truncated at the requested decimal position.
Here are some more examples of the CALCREAL command:
COMMAND                                           ANSWER
-----------------------------------------------   --------
CALCREAL answer = "12.0" "*" "4.0" "2"            "48.00"
CALCREAL answer = "12.0" "HIGHEST" "4.0" "2"      "12.00"
CALCREAL answer = "12" "LOWEST" "4" "1"           "4.0"
CALCREAL answer = "12" "-" "4" "3"                "8.000"
CALCREAL answer = "12" "+" "4" "1"                "16.0"
CALCREAL answer = "7" "/" "2" "2"                 "3.50"
CALCREAL answer = "7" "/" "2"                     "3.5"
CALCREAL answer = "7" "*" "2"                     "14.0"
As shown in the examples, if you do not use the optional fixed-decimal
value, calculations are in "floating point". That is to say, the answer
has as many decimal places as necessary. (Bear in mind the accuracy
restrictions mentioned earlier.) Trailing zeros are removed, unless there
are no digits after the decimal point, in which case a 0 is added.
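The truncation and trailing-zero rules can be sketched in Python with the
standard decimal module. This is an illustration only. The function names
are hypothetical, an empty value is assumed to count as zero, and Python's
Decimal type is more precise than CALCREAL, so the accuracy limits listed
earlier are not reproduced.

```python
from decimal import Decimal, ROUND_DOWN

def clean_number(text):
    """Strip the tabs, spaces and commas that CALCREAL ignores."""
    for ch in "\t ,":
        text = text.replace(ch, "")
    return text

def calcreal(value1, operation, value2, fixed_decimals=None):
    a = Decimal(clean_number(value1) or "0")   # assumption: empty is 0
    b = Decimal(clean_number(value2) or "0")
    if operation == "+":
        result = a + b
    elif operation == "-":
        result = a - b
    elif operation == "*":
        result = a * b
    elif operation == "/":
        result = a / b
    elif operation == "HIGHEST":
        result = max(a, b)
    elif operation == "LOWEST":
        result = min(a, b)
    else:
        raise ValueError("Unknown operation: " + operation)
    if fixed_decimals is not None:
        places = int(clean_number(fixed_decimals))
        step = Decimal(1).scaleb(-places)      # e.g. 0.01 for 2 places
        # Truncate (no rounding) at the requested decimal position
        return format(result.quantize(step, rounding=ROUND_DOWN), "f")
    text = format(result, "f")
    if "." in text:
        text = text.rstrip("0")                # remove trailing zeros
    if text.endswith("."):
        text += "0"                            # keep one digit after the point
    elif "." not in text:
        text += ".0"
    return text

print(calcreal("3.14159", "+", "0", "4"))   # 3.1415
print(calcreal("7", "/", "2"))              # 3.5
print(calcreal("7", "*", "2"))              # 14.0
```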
--------------------
The CALCBITS Command
--------------------
** ADVANCED COMMAND FOR EXPERIENCED USERS **
FORMAT: CALCBITS var1 value1 operation value2
PURPOSE: CALCBITS performs logical operations
SEE ALSO: "The MakeData Command"
The CALCBITS command performs "bit-wise" operations on single bytes. The
following operations can be performed with CALCBITS:
SYMBOL   DESCRIPTION
------   ---------------------------------
"AND"    value1 AND value2
"OR"     value1 OR value2
"XOR"    value1 XOR value2
"SHR"    Shift value1 right by value2 bits
"SHL"    Shift value1 left by value2 bits
Let us say you want to strip the high bit from all of the bytes in an input
file. You could accomplish this with the following POM file:
CHOP 1-1 <-- Read the input file one byte at a time
CALCBITS z $FLINE "AND" $7F <-- Remove the high bit from the byte
OUT |{z} <-- Send the result to the output file
Note that because we are reading the file one byte at a time, $FLINE is
always one byte long. Parse-O-Matic will terminate with an error message
if you attempt to use CALCBITS with a value longer than one byte. Thus,
assuming the variable xyz contains "ABCDEF", the following line is valid:
CALCBITS answer = xyz[3] "AND" $7F
However, the following line would not be permitted because it refers to
more than one byte:
CALCBITS answer = xyz[3 4] "AND" $7F
Here are some more examples of the CALCBITS command:
COMMAND                           ANSWER  COMMENTS
-------------------------------   ------  ------------------------------
CALCBITS answer = $FF "AND" $7F   $7F
CALCBITS answer = "9" "AND" $39   $39     $39 is the character "9"
CALCBITS answer = $F0 "OR" $0F    $FF
CALCBITS answer = $7F "XOR" $08   $77
CALCBITS answer = $80 "SHR" $01   $40     $80 = 10000000; $40 = 01000000
CALCBITS answer = $01 "SHR" $01   $00     $01 = 00000001; $00 = 00000000
CALCBITS answer = $01 "SHL" $01   $02     $01 = 00000001; $02 = 00000010
CALCBITS answer = $80 "SHL" $01   $00     $80 = 10000000; $00 = 00000000
In most of these examples, we use hex notation (e.g. $01), but you can also
use single characters (e.g. "3" which is equivalent to $33) or decimal
notation (e.g. #64 which is equivalent to $40). However, you should always
bear in mind that you are working with the underlying bit pattern. The
following lines are NOT equivalent:
CALCBITS answer = $7F "SHL" $01 <-- Shifts left one bit
CALCBITS answer = $7F "SHL" "1" <-- This is not the same!
The second line interprets "1" as hex $31 (decimal 49). There is obviously
no point in shifting an eight-bit byte 49 positions to the left.
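The operations can be tried out with the following Python sketch, which
treats each value as a number from 0 to 255 and keeps only the low eight
bits of the answer. The function name calcbits is hypothetical, and this is
an illustration rather than the program's own code.

```python
def calcbits(value1, operation, value2):
    """Bit-wise operation on two single bytes (illustrative sketch)."""
    for value in (value1, value2):
        if not 0 <= value <= 0xFF:
            raise ValueError("CALCBITS works on single bytes only")
    if operation == "AND":
        result = value1 & value2
    elif operation == "OR":
        result = value1 | value2
    elif operation == "XOR":
        result = value1 ^ value2
    elif operation == "SHR":
        result = value1 >> value2
    elif operation == "SHL":
        result = value1 << value2
    else:
        raise ValueError("Unknown operation: " + operation)
    return result & 0xFF          # bits shifted out of the byte are lost

print(f"${calcbits(0x80, 'SHR', 0x01):02X}")      # $40
print(f"${calcbits(0x7F, 'SHL', 0x01):02X}")      # $FE
print(f"${calcbits(0x7F, 'SHL', ord('1')):02X}")  # $00 (shifted 49 bits)
```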
===========================================================================
INPUT PREPROCESSORS
===========================================================================
-----------------
The SPLIT Command
-----------------
FORMAT: SPLIT from-position to-position [,from-pos'n to-pos'n] [...]
The maximum length of an input line from a text file is 255 characters. If
your input file is wider than that, you must break up the file into
manageable chunks, using the SPLIT command. This command lets you specify
the way in which each input line is broken up so that it will look like
several SEPARATE lines.
For example, if your input lines were up to 300 characters wide, you could
specify:
SPLIT 1 255, 256 300
This breaks up each line as if it were two lines. If a line is shorter
than 256 characters, it is still treated as two lines, but the second
line is null (i.e. empty).
You can specify up to 130 splits (use multiple SPLIT commands if
necessary). With SPLIT, Parse-O-Matic can handle large input records,
up to a maximum total length of 32767 characters.
The best way of handling SPLIT or CHOPped files is to use a combination of
$SPLIT (explained in more detail later) and BEGIN/END. For example:
SPLIT 1 250, 251 300
BEGIN $SPLIT = "1"
SET a = $FLINE[ 1 10]
SET b = $FLINE[11 20]
END
BEGIN $SPLIT = "2"
SET x = $FLINE[ 1 10]
SET y = $FLINE[11 20]
OUTEND |{a} {b} {x} {y}
END
This outputs the data which appears (in the input file) in columns 1-10,
11-20, 251-260 and 261-270.
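The effect of SPLIT on a single input line can be sketched in Python as
follows. This is an illustration only; the function name split_line and
the test line are hypothetical. Positions are 1-based and inclusive, as in
the SPLIT command itself.

```python
def split_line(line, ranges):
    """Cut one input line into pieces, one piece per from/to pair."""
    return [line[start - 1:end] for start, end in ranges]

# A hypothetical 300-character input line
line = "".join(str(i % 10) for i in range(1, 301))

pieces = split_line(line, [(1, 250), (251, 300)])
for number, fline in enumerate(pieces, start=1):
    # number plays the role of $SPLIT, fline the role of $FLINE
    print(number, len(fline))

# A short line still yields two pieces; the second one is empty
print(split_line("short line", [(1, 255), (256, 300)]))
```

Within the second piece, positions 1 to 10 correspond to columns 251 to
260 of the original line.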
----------------
The CHOP Command
----------------
FORMAT: CHOP from-position to-position [,from-pos'n to-pos'n] [...]
CHOP 0
PURPOSE: Controls the number of bytes Parse-O-Matic will read from
the input file each time it processes the POM file.
SEE ALSO: "The Get Command"
The CHOP command works the same way as the SPLIT command, with one
exception: it informs Parse-O-Matic that the input is a fixed-record-
length file. In other words, it means that the input records are
distinguished by having a particular (and exact) length, rather than being
separated by end-of-line characters (Carriage Return, Linefeed) as is the
case for a standard text file.
Thus, if you have an input file containing fixed-length records, each of
which is 200 characters wide, you could specify it like this:
CHOP 1 200
If the input record is more than 255 characters, you must break it up into
smaller chunks. For example, if the input record was 300 characters wide,
you could break it up like this:
CHOP 1 250, 251 300
By using CHOP, Parse-O-Matic can handle input records up to 32767
characters wide. You can use the $SPLIT variable to manage your use of
CHOP. See the example in the section describing the SPLIT command.
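Reading a fixed-record-length file in this way can be sketched in Python
as follows. This is an illustration only. The function name chop_records
is hypothetical, and the sketch assumes that a short record at the end of
the file is passed through as-is; the manual does not say what
Parse-O-Matic does in that case.

```python
import io

def chop_records(stream, ranges):
    """Yield (piece number, piece) pairs from fixed-length records."""
    record_length = max(end for _, end in ranges)
    while True:
        record = stream.read(record_length)   # no end-of-line markers
        if not record:
            break
        for number, (start, end) in enumerate(ranges, start=1):
            # number plays the role of $SPLIT
            yield number, record[start - 1:end]

# Two hypothetical 300-byte records, chopped like CHOP 1 250, 251 300
data = io.BytesIO(b"A" * 300 + b"B" * 300)
for number, piece in chop_records(data, [(1, 250), (251, 300)]):
    print(number, len(piece), piece[:1])
```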
Manual Reading
--------------
There is a special form of the CHOP command, which looks like this:
CHOP 0
This tells Parse-O-Matic that you will handle all file reading yourself.
In such a case, $FLINE is always null. The only way to get data from the
input file is with the GET command.
When you use CHOP 0 for manual reading, the MINLEN and READNEXT commands
have no meaning. If you place them in the POM file, they are ignored.
===========================================================================
LOOKUP COMMANDS
===========================================================================
------------------
The LOOKUP Command
------------------
FORMAT: LOOKUP var1 value1
PURPOSE: The LOOKUP command searches for value1 in a text file (the
name of which is specified either by the LOOKFILE command or
the /L startup parameter). When POM finds it, it sets var1
to another value found on the same line.
ALTERNATIVES: The REMAP command.
Let us suppose you created a text file, named NAMES.TBL, like this:
R. REAGAN        Ronald Reagan
D. EISENHOWER    Dwight Eisenhower
G. BUSH          George Bush
:                :
Column 1         Column 18
This file can be used to look up a name, as in this POM file:
LOOKFILE "NAMES.TBL"
LOOKCOLS "1" "17" "18" "34"
SET oldname = $FLINE[21 37]
TRIM oldname "R" " "
LOOKUP newname = oldname
OUTEND |{oldname} {newname}
The LOOKFILE command specifies the name of the look-up file. The LOOKCOLS
command specifies the starting and end columns for both the "text-to-look-
for" field (known as the key field) and the "text-to-get" field (known as
the data field).
The LOOKUP command will look for oldname in NAMES.TBL. If oldname is set
to "G. BUSH", LOOKUP sets newname to "George Bush". If, however, oldname
is set to "G. WASHINGTON", which doesn't appear in NAMES.TBL, newname
is set to "" (that is to say, an empty string).
Search Method
-------------
When searching for the key field, LOOKUP compares text according to the
length of the string you are looking for. If your LOOKUP file looks like
this:
ABCDEF 456
ABC 678
XYZABC 345
XYZ 123
then the command LOOKUP x = "XYZ" would match on "XYZABC". If this search
procedure is a problem for you, there are two ways you can deal with it:
1) Pad your search strings before searching, as in this example:
PAD search "R" " " "6"
LOOKUP x = search
If the search variable was originally set to "XYZ", the PAD command
would set it to "XYZ   ", which would not match XYZABC.
2) Put the shorter key fields in the lookup file ahead of the longer
ones (of which they are a sub-string), as in this example:
ABC 678
ABCDEF 456
XYZ 123
XYZABC 345
It is worth pointing out that this look-up file is sorted in
ASCII order (whereas the example given earlier was not). A sorted
file can be more efficient, as explained in "The LookSpec Command".
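The search method can be sketched in Python as follows. This is an
illustration, not the program's own code. The function name lookup and the
in-memory table are hypothetical. The column defaults follow the LOOKCOLS
defaults (key field in columns 1 to 10, data field in columns 12 to 255),
and the sketch skips empty lines and lines that start with a semi-colon.

```python
def lookup(lines, search, key_cols=(1, 10), data_cols=(12, 255),
           trim=True, case_sensitive=False):
    """Return the data field of the first line whose key matches."""
    if not case_sensitive:
        search = search.upper()
    for line in lines:
        if not line or line.startswith(";"):
            continue                            # null line or comment
        key = line[key_cols[0] - 1:key_cols[1]]
        if not case_sensitive:
            key = key.upper()
        # Compare only as many characters as the search string has
        if key[:len(search)] == search:
            data = line[data_cols[0] - 1:data_cols[1]]
            return data.strip(" ") if trim else data
    return ""                                   # not found: null string

# The unsorted example file, laid out with the key field in column 1
# and the data field in column 12
TABLE = [f"{key:<11}{data}" for key, data in
         [("ABCDEF", "456"), ("ABC", "678"),
          ("XYZABC", "345"), ("XYZ", "123")]]

print(lookup(TABLE, "XYZ"))            # 345 (matches XYZABC first)
print(lookup(TABLE, "XYZ".ljust(6)))   # 123 (padded search string)
```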
Limitations
-----------
There is no limit to the number of lines that you can put in a look-up
file. However, the more lines there are, the longer it will take to
process (because there is more to search). The maximum length of a line
in a look-up file is 255 characters.
Null Lines and Comments
-----------------------
In the look-up file, null (empty) lines are ignored. You can also include
comments in the file by starting the line with a semi-colon:
; Some of the Presidents of the United States
R. REAGAN        Ronald Reagan
D. EISENHOWER    Dwight Eisenhower
G. BUSH          George Bush
The LOOKUP command can be used for more than just names, of course. You
could use it to look up prices, phone numbers, addresses and so on.
Multiple Columns
----------------
You can use the same lookup file to find different items that are related
to the same key field. For example, let's say you have created a lookup
file, named EMPLOYEE.TBL, which looks like this:
; EMPLOYEE#   NAME           PHONE
  00001       John Smith     555-1212
  00002       Mary Jones     555-2121
  00003       Fred Johnson   555-1122
You could look up an employee's name and phone number as follows:
LOOKFILE "EMPLOYEE.TBL"
LOOKCOLS "3" "7" "15" "37"
LOOKSPEC "N" "Y" "N"
LOOKUP empdata = "00002"
SET name = empdata[ 1 12]
SET phone = empdata[16 23]
TRIM name "B" " "
TRIM phone "B" " "
You could, of course, specify a different LOOKCOLS prior to each LOOKUP,
but that would mean reading the disk twice. In most cases, it is faster
to obtain the data all at once, then extract it.
LOOKUP Versus REMAP
-------------------
If you have only a few thousand bytes of lookup data, you might be able to
use the REMAP command instead of LOOKUP. However, you can not simply
replace LOOKFILE and LOOKUP with MAPFILE and REMAP. REMAP does not return
a null value if it can not find the item being sought, so you will have to
change your POM file to compare the original string with the revised
string, in order to see if it has changed (i.e. it was found). Even with
this test, REMAP might "fool you" if it finds a partial match.
If you are processing a lot of input data, using REMAP may speed up
processing, since REMAP works in RAM memory, while LOOKUP reads the disk.
However, if your disk uses "caching", the performance improvement may be
negligible.
--------------------
The LOOKFILE Command
--------------------
FORMAT: LOOKFILE value1
PURPOSE: The LOOKFILE command specifies the name of the look-up file
for the next LOOKUP command.
SEE ALSO: "How Parse-O-Matic Searches for a File"
LOOKFILE lets you use several look-up files in one POM file. For example:
SET name = $FLINE[1 20]
; Look up full name
LOOKFILE "NAMES.TBL"
LOOKCOLS "1" "25" "30" "50"
LOOKUP fullname = name
; Look up phone number
LOOKFILE "PHONE.TBL"
LOOKCOLS "1" "25" "30" "40"
LOOKUP phone = name
; Output result
OUTEND |{name} {fullname} {phone}
If you only have one look-up file, you may omit the LOOKFILE command and
specify the file name on the command line, using the /L parameter. For
example, you could write a POM file like this:
SET name = $FLINE[1 20]
; Look up full name
LOOKCOLS "1" "25" "30" "50"
LOOKUP fullname = name
; Output result
OUTEND |{name} {fullname}
Your POM command could then look like this:
POM MYPOM.POM INPUT.TXT OUTPUT.TXT /LC:\MYFILES\NAMES.TBL
This technique allows you to use several different look-up files with the
same POM file, simply by changing the command line. (The method by which
Parse-O-Matic finds the file is discussed in the section "How Parse-O-Matic
Searches for a File".)
The longest line allowed in a look-up file is 255 characters long.
--------------------
The LOOKCOLS Command
--------------------
FORMAT: LOOKCOLS value1 value2 value3 value4
PURPOSE: The LOOKCOLS command specifies the starting and ending
columns for the key and data fields in a look-up file (see
the explanation of the LOOKUP command for an overview of
look-up files).
PARAMETERS: value1 specifies the starting column for the key field
value2 specifies the ending column for the key field
value3 specifies the starting column for the data field
value4 specifies the ending column for the data field
NUMERICS: Tabs, spaces and commas are stripped from value1, 2, 3 and 4
You can specify a null value to indicate "same as last time". For example:
SET name = $FLINE[1 20]
LOOKFILE "NAMES.TBL"
LOOKCOLS "1" "25" "30" "50"
LOOKUP fullname = name
LOOKFILE "PHONE.TBL"
LOOKCOLS "" "" "" "40"
LOOKUP phonenum = name
OUTEND |{name} {fullname} {phonenum}
The second LOOKCOLS command uses the same numbers for the first three
values that the first LOOKCOLS command used.
If you do not specify a LOOKCOLS command, the default values are:
Key Field: Starting column = 1
Ending column = 10
Data Field: Starting column = 12
Ending column = 255
This is equivalent to LOOKCOLS "1" "10" "12" "255".
--------------------
The LOOKSPEC Command
--------------------
FORMAT: LOOKSPEC value1 value2 value3
PURPOSE: The LOOKSPEC command configures the way the next LOOKUP
command will work.
PARAMETERS: value1 = Trim ("Y" or "N" -- default "Y")
value2 = Sorted ("Y" or "N" -- default "N")
value3 = Case-sensitive ("Y" or "N" -- default "N")
The Trim setting specifies whether or not the data field should have spaces
stripped off both ends.
The Sorted setting specifies whether or not the look-up file is sorted by
the key field. A sorted file is much faster than an unsorted file. This
is especially noticeable if you have a large look-up file and a lot of
input to process.
The Case-sensitive setting specifies whether or not LOOKUP should distin-
guish between upper and lower case when searching. The default setting is
"N" (No), so that LOOKUP would find "John Smith", even if it appeared in
the look-up file as "JOHN SMITH". It is usually safest to set Case-
sensitivity to "N", but if you set it to "Y", searching is slightly faster.
You can specify a null value to indicate "same as last time". For example:
SET name = $FLINE[1 20]
LOOKFILE "DATA.TBL"
LOOKCOLS "1" "25" "30" "50"
LOOKSPEC "Y" "Y" "Y"
LOOKUP fullname = name
LOOKCOLS "" "" "60" "70"
LOOKSPEC "N" "" ""
LOOKUP phonenum = name
OUTEND |{name} {fullname} {phonenum}
The second LOOKSPEC command uses the same settings for Sorted and Case-
sensitivity as the first one, but specifies a different Trim setting.
===========================================================================
DATA CONVERTERS
===========================================================================
--------------------
The MAKEDATA Command
--------------------
** ADVANCED COMMAND FOR EXPERIENCED USERS **
FORMAT: MAKEDATA var1 value1 value2
PURPOSE: MAKEDATA converts text data into a binary format.
PARAMETERS: var1 is the variable being set
value1 is the text data you want to convert
value2 is the predefined data type you want to create
NUMERICS: Tabs, spaces and commas are stripped from value1, if it is
numeric (as indicated by value2)
SEE ALSO: "Predefined Data Types" and "The CalcBits Command"
When you are writing to a binary file (using the OUT command), you often
need to convert text information to a binary representation. MAKEDATA
recognizes many standard data formats (see "Predefined Data Types").
Creating Binary Data
--------------------
Let us say you have a four-line text file that looks like this:
1234
-456
23
90211
Here is a POM file that reads the numbers from the file, then outputs them
in binary format, as 16-bit signed integers:
MAKEDATA z $FLINE "INTEGER" <-- Convert the number to an integer
OUT |{z} <-- Send the integer to the output file
We use OUT instead of OUTEND, since OUTEND would put an end-of-line
(Carriage Return, Line Feed) after the data.
If the POM file shown in the example was run with the input data shown,
it would create an output file containing four integers. In other words,
the file would be eight bytes long (four integers of two bytes each).
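The conversion can be sketched in Python with the standard struct module.
This is an illustration only. The function name make_integer is
hypothetical, the byte order (low byte first, as is usual on PC hardware)
is an assumption, and the sample values are chosen to fit the range of a
16-bit signed integer (-32768 to 32767); struct.pack raises an error for
values outside that range.

```python
import struct

def make_integer(text):
    """Convert a text number to a 2-byte signed integer (sketch)."""
    for ch in "\t ,":
        text = text.replace(ch, "")    # numerics are stripped first
    # Assumption: little-endian byte order, "<h" = 16-bit signed integer
    return struct.pack("<h", int(text))

# Four hypothetical input lines
output = b"".join(make_integer(line) for line in ["1234", "-456", "23", "902"])
print(len(output))    # 8
```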
Converting Dates
----------------
In some files, a date serial number (see "The ZeroDate Command") might be
represented by a numeric format such as INTEGER or LONGINT. To write a
date serial number to the output file, you must first convert the date with
MAKEDATA, then use MAKEDATA again to convert the resulting number to the
appropriate data type.
The value1 part of the MAKEDATA command must be in a precise format:
"YYYY [M]M [D]D" <-- Square brackets indicate optional digits
That is to say:
1) A four-digit year
2) A space
3) A one or two digit month (January = 1 or 01, December = 12)
4) A space
5) A one or two digit day of the month (e.g. 1 or 01 or 31)
You can assemble the date string from various other data, using the
DATE command. Let us say you have a one-line text file that contains
the date in Month-Day-Year format:
01-01-2001
You can read this file and output a date serial number as a long integer
(LONGINT) with the following POM file:
ZERODATE "2000" "1" "1" <-- Set "day zero"
SET year = $FLINE[7 10] <-- Get the year
SET month = $FLINE[1 2] <-- Get the month
SET day = $FLINE[4 5] <-- Get the day of the month
DATE x year month day "Y ?n ?d" <-- Set x to "2001 01 01"
MAKEDATA y x "DATE" <-- Set y to "366"
MAKEDATA z y "LONGINT" <-- Set z to a long integer
OUT |{z} <-- Place it in the output file
A typical problem with date data is that the year does not include
the first two digits (e.g. you have "97" instead of "1997"). In
such cases, your POM file has to make a decision as to which century
the date belongs to. Here is one way to handle this situation:
BEGIN year #>= "50"
CALC year = year "+" "1900"
ELSE
CALC year = year "+" "2000"
END
This works around the problem as follows:
Any year between   is placed in       Examples
----------------   ----------------   --------------
"50" and "99"      the 20th century   "1950" "1999"
"00" and "49"      the 21st century   "2000" "2049"
You have to be careful when choosing the "cut-off date" (1950, in the
example above). You should make your decision only after studying your
input data carefully.
Practical Considerations
------------------------
The examples shown here assume that your input file contains only one kind
of data. In most cases, you will use the CHOP command to obtain complete
data records of fixed length, then use SET to extract portions thereof.
If you are reading a file with variable-length records, you can use CHOP 0
(manual reading) and the GET command.
--------------------
The MAKETEXT Command
--------------------
** ADVANCED COMMAND FOR EXPERIENCED USERS **
FORMAT: MAKETEXT var1 value1 value2
PURPOSE: MAKETEXT converts binary data into text format.
PARAMETERS: var1 is the variable being set
value1 is the data you want to convert
value2 is the predefined data type of value1
NOTES: value1 is normally in binary (i.e. it looks like "garbage
characters" if you output it to a text file). However, if
value2 specifies the DATE data type, value1 must be in text
form (e.g. "1234"). The reason for this difference is
described in the "Converting Dates" section below.
SEE ALSO: "Predefined Data Types"
When reading a binary file (using the CHOP command), you often need to
convert binary information to a text representation. MAKETEXT recognizes
many standard data formats (see "Predefined Data Types").
Converting Binary Data
----------------------
Let us say you have a binary file that contains several WORD values
(unsigned integers, each of which is 2 bytes long). You can read and
decode them with the following POM file:
CHOP 1-2 <-- Read the file two bytes at a time
MAKETEXT x $FLINE "WORD" <-- Convert the WORD to text format
OUTEND |{x} <-- Output the data to a text file
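The same decoding can be sketched in Python with the standard struct
module. This is an illustration only; the function name words_to_text is
hypothetical and the byte order (low byte first) is an assumption.

```python
import struct

def words_to_text(data):
    """Decode a run of 2-byte unsigned integers into text numbers."""
    # "<H" = little-endian 16-bit unsigned integer (assumed byte order)
    return [str(value) for (value,) in struct.iter_unpack("<H", data)]

print(words_to_text(b"\x01\x00\xff\xff"))    # ['1', '65535']
```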
Converting Dates
----------------
MAKETEXT can convert a date serial number (see "The ZeroDate Command") to
a formatted date. Since there is no standard data format for date serial
numbers, you must use MAKETEXT to convert the number into text form, and
then use MAKETEXT again to format the date.
Let us say you have a binary file that contains dates, represented as
LONGINTs (4-byte signed integers). You could convert them to dates with
the following POM file:
CHOP 1-4 <-- Read 4 bytes at a time
ZERODATE "1936" "1" "1" <-- Set the "zero date"
MAKETEXT x $FLINE "LONGINT" <-- Convert the binary data to a text number
MAKETEXT y x "DATE Y-M-?d" <-- Convert to text date (e.g. "1998-JUL-01")
OUTEND |{y} <-- Output the date to a text file
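The two conversion steps can be sketched in Python as follows. This is an
illustration only. The function name longint_to_date is hypothetical, the
byte order (low byte first) is an assumption, and the output layout
imitates the "Y-M-?d" date format.

```python
import struct
from datetime import date, timedelta

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

def longint_to_date(raw, zero=date(1936, 1, 1)):
    """Decode a 4-byte date serial number and format it (sketch)."""
    (serial,) = struct.unpack("<i", raw)      # binary data to number
    d = zero + timedelta(days=serial)         # number to calendar date
    return f"{d.year}-{MONTHS[d.month - 1]}-{d.day:02d}"

# Build a sample 4-byte value for July 1, 1998, then decode it
raw = struct.pack("<i", (date(1998, 7, 1) - date(1936, 1, 1)).days)
print(longint_to_date(raw))    # 1998-JUL-01
```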
Practical Considerations
------------------------
The examples shown here assume that your input file contains only one kind
of data. In most cases, you will use the CHOP command to obtain complete
data records of fixed length, then use SET to extract portions thereof.
If you are reading a file with variable-length records, you can use CHOP 0
(manual reading) and the GET command.
===========================================================================
MISCELLANEOUS COMMANDS
===========================================================================
-----------------
The ERASE Command
-----------------
FORMAT: ERASE value1
PURPOSE: Deletes a file (if it exists).
PARAMETERS: value1 is the name of the file to be deleted
SEE ALSO: "Long File Names in Win95"
Here is an example of the ERASE command:
ERASE "C:\XYZ.TXT"
This will delete the file C:\XYZ.TXT if it exists. If it does not exist,
nothing is done.
You can not delete the current input file, output file, trace file or
lookup file. If you attempt to do so, Parse-O-Matic will terminate with
an error.
You can not delete a device (e.g. ERASE "LPT1:"). The ERASE command
simply ignores such requests.
If value1 is preceded by a "+" character, the plus sign is ignored.
See "How Parse-O-Matic Opens an Output File" for an explanation of the
significance of the plus sign.
------------------
The GETENV Command
------------------
FORMAT: GETENV var1 value1
PURPOSE: GETENV obtains a system environment variable
PARAMETERS: var1 is the variable being set
value1 is the name of the system environment variable
NOTES: System environment variables are sometimes referred to as
"DOS Environment Variables" or "SET Variables".
SEE ALSO: Explanations of the SET & PATH commands in your DOS manual,
or "The Environment Area" in your Windows or OS/2 manual.
GETENV enables you to access certain important settings that concern your
computer's operating system. To see what settings are available, enter the
following command at the DOS prompt:
SET
This will display the contents of your computer's "environment area". Two
of the most important values in the environment area are COMSPEC and PATH.
These are briefly described later, but refer to your operating system
manual for full details.
GETENV removes all spaces, tabs and equals-signs ("=") from value1,
converts it to uppercase, then looks it up in the system environment area.
- If it finds it, var1 is set to the corresponding value.
- If it does not find it, var1 is set to an empty (null) string.
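The clean-up and look-up steps can be sketched in Python as follows. This
is an illustration only; the function name getenv and its optional
environment parameter (used here so the example does not depend on your
own system settings) are hypothetical.

```python
import os

def getenv(name, environment=None):
    """Look up an environment variable the way GETENV is described."""
    if environment is None:
        environment = os.environ
    for ch in " \t=":
        name = name.replace(ch, "")    # remove spaces, tabs and "="
    return environment.get(name.upper(), "")   # null string if not found

sample = {"COMSPEC": "C:\\COMMAND.COM"}        # hypothetical environment
print(getenv(" comspec =", sample))            # C:\COMMAND.COM
print(repr(getenv("NOSUCHVAR", sample)))       # ''
```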
Disappearing Environment Variables
----------------------------------
Sometimes an environment variable disappears for no apparent reason. There
are two likely reasons for this:
1) You ran out of environment space.
There is only a limited amount of room in the system environment area
(which is located in RAM memory). If you think this is the problem,
type your DOS SET command to save a variable into the system
environment, then type SET by itself to review the contents of the
environment. If your variable does not appear, consult your operating
system manual to find out how to expand your environment space.
2) It was set by a COPY of the operating system.
If you are in Windows and you run DOS, then use the DOS SET command, it
will only affect the environment area associated with the copy of DOS
that you are running. When you exit this copy and start up another
one, it will not contain the variable. You can address this problem by
setting the variable in your AUTOEXEC.BAT file, or by running a batch
file that sets the variable before running Parse-O-Matic.
Examples
--------
The following command will determine which directories get searched when
you are looking for a program or a file:
GETENV path "PATH"
To find out the name of your command interpreter (usually COMMAND.COM)
and where it is located, try this command:
GETENV comspec "COMSPEC"
You can use GETENV as a simple "input routine" for Parse-O-Matic
applications. For details, see "Controlling a POM File from the Command
Line", in the section entitled "Effective Use of Batch Files".
---------------
The LOG Command
---------------
FORMAT: LOG value1 [comparator] value2 value3 [value4 [value5]]
PURPOSE: LOG places a message (value3) in the processing log file
(POMLOG.TXT) if the comparison is true. Both value4 and
value5 are optional; if they are present, they are added
to the end of value3.
NOTES: The processing log is described in the section "Logging".
Here is an example of the LOG command:
SET emplnumb = $FLINE[ 1 9]
SET sales = $FLINE[10 20]
TRIM sales "B" " "
LOG sales = "0" "WARNING! Zero sales for employee number:"
LOG sales = "0" emplnumb
This adds two warning lines to the processing log if the sales figure is
zero.
The logging feature lets you run Parse-O-Matic unattended, then come back
later to review (via the processing log) any exceptional conditions. For
some additional comments on logging, see "Unattended Operation".
The maximum length of a LOG string (value3, plus value4 and value5 if
present) is 245 characters.
-------------------
The MSGWAIT Command
-------------------
FORMAT: MSGWAIT value1
PURPOSE: MSGWAIT controls the amount of time that a processing error
message appears on the screen before it times out. (Messages
from the HALT command are treated as error messages.)
PARAMETERS: value1 is the delay time in seconds
NUMERICS: Tabs, spaces and commas are stripped from value1
DEFAULTS: If the MSGWAIT command is not included in the POM file, and
an error occurs, Parse-O-Matic will wait until you press a
key; the message will not time out.
NOTES: If value1 is "0", error messages will not time out.
The maximum value for value1 is 60000 (about 16 hours).
You can set value1 to "1", but one second is usually too
short a delay; a value of "60" (one minute) is better.
SEE ALSO: "The Halt Command", "Unattended Operation", "Quiet Mode"
The MSGWAIT command lets you control the behaviour of error messages that
appear during the processing of an input file. This is helpful if you
have created POM applications that are run unattended.
If Parse-O-Matic was invoked by a batch file or application program, you
may want error messages to "time out", allowing Parse-O-Matic to terminate,
and processing to continue.
Standard Behaviour
------------------
If Parse-O-Matic encounters an error while reading in a POM file (i.e.
during the "compile" step), it displays a message on the screen and waits
until you press a key. Parse-O-Matic then terminates.
When running the actual POM file (i.e. while processing the input file),
Parse-O-Matic will normally behave the same way: if an error occurs (or
if a HALT command is encountered), it will display a message on the screen
and wait for you to press a key before it terminates.
Setting a Time-Out Delay
------------------------
You can use the MSGWAIT command to tell Parse-O-Matic to continue ("time
out") after a certain number of seconds. For example:
MSGWAIT "60"
This tells Parse-O-Matic to wait about 60 seconds if an error is
encountered while processing the input file. Parse-O-Matic will then
terminate. (The actual delay depends on the type of computer you are
using; a delay of "60" will typically last between 55 and 65 seconds).
Color Cues
----------
If you have a color monitor, you can tell if a message will "time out"
by the color of the "Press a key to continue" prompt:
- If it is magenta (sometimes called "purple") it will NOT time out
- If it is blue, it WILL time out
Key Stacking
------------
To ensure that an error message is not inadvertently bypassed, "stacked"
keystrokes are ignored by Parse-O-Matic. That is to say, if you press
several keys before an error message is displayed, Parse-O-Matic gets rid
of them before displaying the message.
Exceptions
----------
If Parse-O-Matic is processing an empty input file, it will display the
warning "Input file is empty", then continue processing the POM file when
you press a key, or after a delay of about 60 seconds.
The MSGWAIT command does not affect messages that report errors detected
during the compilation (initial read-in) of the POM file.
The MSGWAIT command does not affect the "Retry or Cancel" message that
appears if you are dealing with a device (see "Sending Output to a
Device").
A Word of Caution
-----------------
A POM file should be thoroughly tested before setting the MSGWAIT time to a
value other than "0". Most error messages are serious enough to justify
waiting until the user acknowledges them.
If you call Parse-O-Matic from a batch file or application program, you can
check the success of the parsing job by checking the return code. (See
"Effective Use of Batch Files" and "Running Parse-O-Matic from Another
Program").
If there was a processing error and you did not check the parsing job
(either by testing the program return code, or by consulting the
processing log), the resulting oversight could be serious.
-----------------
The PAUSE Command
-----------------
FORMAT: PAUSE value1
PURPOSE: Delays the specified number of milliseconds
PARAMETERS: value1 is the delay time (between 1 and 65500)
NUMERICS: Tabs, spaces and commas are stripped from value1
NOTES: 1 millisecond = One thousandth of a second
100 milliseconds = One tenth of a second
1000 milliseconds = One second
60000 milliseconds = One minute
Here are some typical applications of the PAUSE command:
- Slow down Parse-O-Matic so you can watch the processing screen
- Give a slow laser printer extra time to eject a page after an OUTPAGE
- Give you time to remove a page from a dot-matrix printer after an OUTPAGE
- Give a communications device time to complete its current operation
Here is an example of the last application:
OFILE "COM1:" <-- Direct output to the modem on COM1
OUTEND |ATZ <-- Send a modem initialization command
PAUSE "1000" <-- Wait one second for the command to complete
OUTEND |ATDT555-1234 <-- Send a dialing command to the modem
If your PAUSE command is 200 milliseconds or longer, Parse-O-Matic displays
a "PAUSED" message in the lower right corner of the processing screen.
While this appears, you can press any key to end the pause. (We recommend
that you use the spacebar -- and avoid the Esc key. Parse-O-Matic
processing will be terminated if the PAUSE happens to end at the precise
moment your finger is coming down on the Esc key!)
-----------------
The SOUND Command
-----------------
FORMAT: SOUND value
PURPOSE: The SOUND command performs two functions:
1) It makes a noise, or ...
2) It sets the noise made when an error occurs
The SOUND command has a repertoire of nine distinctive noises:
BEEP BIP BUZZ EDGE ERROR HUH PIP TRILL WHOOP
These sounds are useful for alerting you to unusual situations. Let's say
you wanted to be warned if one of the fields in a file comes up blank. You
could write the code this way:
BEGIN lastname = ""
SOUND "WHOOP"
SET lastname = "?"
END
Case is not important; the following commands are all equivalent:
SOUND "WHOOP"
SOUND "Whoop"
SOUND "whoop"
The LISTEN Utility
------------------
You can listen to any given sound by using the LISTEN command at the DOS
prompt. To hear what TRILL sounds like, enter this command:
LISTEN trill
By default, Parse-O-Matic error messages will alert you by playing the
ERROR sound. To hear this sound, enter the following command at the DOS
prompt:
LISTEN error
Changing the Error Message Sound
--------------------------------
If you find the error message sound annoying, you can replace it with
one of the other sounds by using the special ERRMSG specification of the
SOUND command. For example, to replace the ERROR sound with the BUZZ
sound, place this line at the top of your POM file:
SOUND "ERRMSG BUZZ"
If you don't want any sound made when an error occurs, use this command:
SOUND "ERRMSG QUIET"
The ERRMSG specification will only affect errors generated during the
actual running of the POM file. If an error is encountered while
Parse-O-Matic is compiling the POM file, it will use the ERROR sound
when it reports the problem.
-----------------
The TRACE Command
-----------------
FORMAT: TRACE var1
PURPOSE: The TRACE command is an alternative to standard tracing (see
"Tracing", in the "Techniques" section).
PARAMETERS: var1 is the variable being traced.
When you include a TRACE command in your POM file, Parse-O-Matic will
create a text file, named POM.TRC, and use it to keep a detailed record of
POM's processing. Here is an example of the TRACE command:
TRACE PRICE
This traces the variable named "PRICE". After processing, the file POM.TRC
will show everything that happened, and give the value of PRICE at the
TRACE line.
NOTE: Since trace files are so detailed, they can be very large. If you
are trying to debug a POM file using TRACE, it is a good idea to use a
small input file.
===========================================================================
TERMS
===========================================================================
------
Values
------
A value can be specified in the following ways:
"text" A literal text string
#number An ASCII character, in decimal (e.g. #32 = Space)
#number#number... Several ASCII characters (e.g. #32#32 = 2 Spaces)
$xx A byte, in hexadecimal (e.g. $2F = decimal 47)
$xx$xx... Several hex bytes ($ff$ff = binary 1111111111111111)
VARNAME The name of a variable
VARNAME[start end] A substring of a variable
VARNAME[start] A single character
VARNAME+ Inline incremented variable (explained below)
VARNAME- Inline decremented variable (explained below)
Variable names can be up to 12 characters long. There is no distinction
between upper and lower case in the variable name. A POM file can contain
about 1000 variables and literals.
The # character is used to specify a literal text string of one or more
characters. Follow each # with the decimal value of the ASCII character
you want. Here are some useful values:
#10 = Line Feed #12 = Form Feed #13 = Carriage Return
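As an illustration of the # and $ notations, here is a minimal Python
sketch (not part of Parse-O-Matic) that turns such a specification into
the bytes it stands for. The function name decode_literal is hypothetical.
The sketch assumes each hex byte is written with exactly two digits, and
it allows # and $ items to be mixed, which the manual does not confirm.

```python
def decode_literal(spec):
    """Decode a run of #decimal or $hex items into bytes."""
    out = bytearray()
    i = 0
    while i < len(spec):
        marker = spec[i]
        i += 1
        if marker == "#":
            # Collect the decimal digits that follow the # marker.
            j = i
            while j < len(spec) and spec[j].isdigit():
                j += 1
            out.append(int(spec[i:j]))
            i = j
        elif marker == "$":
            # Assumption: a hex byte is always two hex digits.
            out.append(int(spec[i:i + 2], 16))
            i += 2
        else:
            raise ValueError("expected # or $ at position %d" % (i - 1))
    return bytes(out)

print(decode_literal("#32#32"))   # two spaces
print(decode_literal("$2F"))      # one byte, decimal 47
```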
Parse-O-Matic predefines several variables. They are:
$FLINE = The line just read from the file (max. length 255 chars)
$FLUPC = The line just read from the file, in uppercase
$BRL = The { character (used in OUT)
$BRR = The } character (used in OUT)
$COMMAND = The current POM command line (see "POM and Wildcards")
$SPLIT = The CHOP or SPLIT number you are currently processing
$TAB = The tab character (Hex $09; ASCII 09)
Although these predefined variables start with a dollar sign ($), it does
not mean they are in some way "hexadecimal" (as in the case of the hex
values mentioned earlier). In this case, the $ character is simply a means
to indicate that the variables are defined by Parse-O-Matic. In general,
you should avoid creating variables that start with anything but a letter.
Since $FLINE has a maximum length of 255 characters, you will have to use
the SPLIT or CHOP command if your input file is wider than that. The
$SPLIT variable reports which segment you are processing. For example,
if you had this command...
CHOP 1 255, 256 380
then $SPLIT would be set to "1" when it was processing columns 1 to 255,
and it would be set to "2" when it was processing columns 256 to 380.
----------
Delimiters
----------
If you need to specify a quotation mark, use "". For example:
IGNORE $FLINE = "He said ""Hello"" to me."
This ignores any line containing: He said "Hello" to me.
------------------
Illegal Characters
------------------
No POM command can contain these ASCII characters:
HEX DECIMAL NAME
------- ------- --------------------
$00 #00 NULL
$0A #10 LF (Linefeed)
$0D #13 CR (Carriage Return)
Of course, LF and CR do appear at the end of each line in the POM file,
which is a text file. If you have to specify these characters in a POM
command, use either the $ or # character to denote hex or decimal literals
(e.g. SET linefeed = $0A).
-----------------
Using Comparators
-----------------
Several POM commands decide what to do by comparing two values. For example:
IF $FLINE[1 3] = "XYZ" THEN x = "3" ELSE "4"
In this example, if the first three characters of $FLINE are "XYZ", the
variable x is set to "3", otherwise it is set to "4". The first equals
sign ("=") is a "comparator", because it defines how two values will be
compared. The second equals sign is not a comparator; it is simply
padding, which makes the line easier to understand (see the section
"Padding for Clarity" for details).
Parse-O-Matic supports the following comparators:
COMPARATOR  INTERPRETATION  MEANING                 COMMENTS
----------  --------------  ----------------------  --------
=           Literal         Identical
<>          Literal         Not identical
>           Literal         Higher                  See NOTE
>=          Literal         Higher, or identical    See NOTE
<           Literal         Lower                   See NOTE
<=          Literal         Lower, or identical     See NOTE
^           Literal         Contains
~           Literal         Does not contain
LONGER      Literal         Length is longer
SHORTER     Literal         Length is shorter
SAMELEN     Literal         Length is the same
#=          Numerical       Equal
#<>         Numerical       Not equal
#>          Numerical       Greater
#>=         Numerical       Greater, or equal
#<          Numerical       Less than
#<=         Numerical       Less than, or equal
NOTE: Depends on PC-ASCII sort order. Refer to the section "Literal
Comparisons and Sort Order" for details.
Whenever a comparator is required, but is omitted, it is assumed to be
"literally identical". Thus, the following lines are equivalent:
IF x y z "3" "4" (This is very terse, but it works)
IF x y THEN z = "3" ELSE "4" (The "equals" comparator is omitted)
IF x = y THEN z = "3" ELSE "4" (This is a lot easier to read)
With some restrictions (discussed later), literal comparators work on
numeric and alphabetic data. Here are some examples of literal comparisons
that are "true":
"ABC" <> "ABCD" "3" <> "4"
"ABC" <= "ABCD" "3" <= "4"
"ABC" < "ABCD" "3" < "4"
"ABC" SHORTER "ABCD" "3" SAMELEN "4"
"ABC" >= "ABC" "ABC" <> "CDE"
"ABC" <= "ABC" "ABC" <= "CDE"
"ABC" = "ABC" "ABC" < "CDE"
"ABC" ^ "ABC" "ABC" SAMELEN "CDE"
"ABC" SAMELEN "ABC" "ABC" ~ "CDE"
Literal Comparisons and Sort Order
----------------------------------
Some of the literal comparators compare text according to "PC-ASCII sort
order". For plain English text, this works fine. However, if your text
contains diacritical (accented) characters, you should be aware that
some comparisons will not work correctly. For example, the "A-Umlaut"
character appears in the PC-ASCII character set AFTER the PC-ASCII value
for "Z".
Numeric Comparisons
-------------------
Some confusion can arise if you use literal comparators on numbers. For
example, this doesn't work as you might expect at first glance:
SET count = count+
BEGIN count >= "2"
OUTEND x = x |{count}
END
You might expect this POM file to output any number greater than or equal
to "2", but in fact, you will get a different result, because the
comparison is a literal (text) comparison. In the example above, "2" to
"9" are greater than or equal to "2", but "10" (which starts with "1") is
less, as is evident when you sort several numbers alphabetically:
1
10
100
11
15
2
20
200
3
30
As you can see, the values 1, 10, 100, 11 and 15 come before "2" when
sorted alphabetically.
To compare numbers, you should use the numeric comparators. The correct
way to code the previous example is as follows:
SET count = count+
BEGIN Count #>= "2" <-- Note the #>= comparator
OUTEND x = x |{count}
END
Written in this way, numbers greater than or equal to two will be output.
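The difference between the two kinds of comparison can be demonstrated
with a short Python sketch (an illustration only, not Parse-O-Matic code).
The function names literal_ge and numeric_ge are hypothetical stand-ins
for the >= and #>= comparators. Python compares strings by character
code, which for plain digits and unaccented letters gives the same
ordering as PC-ASCII.

```python
from decimal import Decimal

def literal_ge(a, b):
    # Text comparison: characters are compared from left to right.
    return a >= b

def numeric_ge(a, b):
    # Numeric comparison: the text is converted to numbers first.
    return Decimal(a) >= Decimal(b)

for count in ["1", "2", "9", "10", "15"]:
    print(count, literal_ge(count, "2"), numeric_ge(count, "2"))
```

For "10" and "15" the literal test is false while the numeric test is
true, which is why the #>= comparator is the right choice for counters.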
Here are some examples of numeric comparisons that are "true":
"345" #<> "567" "1.23" #<> "9.87"
"345" #<= "567" "1.23" #<= "9.87"
"567" #> "345" "9.87" #> "1.23"
"3" #< "6.2"
The last example compares an integer ("3") with a real number ("6.2"). The
numeric comparators automatically check if one of the numbers contains a
decimal point. In such a case, the comparison is performed in "real number"
mode, which imposes the accuracy restrictions described in the section "The
CalcReal Command". This might create a problem if you are comparing a
decimal number with a large integer, but this is not a cause for much
worry, since most parsing jobs tend to compare similar types of numbers.
Upgrading from Earlier Versions
-------------------------------
IF YOU USED PARSE-O-MATIC PRIOR TO VERSION 3.00: Because the comparator
defaults to "literally identical" if it is omitted, POM files created
before version 3.00 will continue to function normally -- with two notable
exceptions. In older versions, the IGNORE and ACCEPT commands defaulted to
"contains". If you have POM files that were created for older versions, you
should check your IGNORE and ACCEPT commands to ensure that they are doing
what you want them to.
---------------------
Predefined Data Types
---------------------
For certain commands (e.g. MAKEDATA, MAKETEXT, GET and GETTEXT),
Parse-O-Matic has internal definitions of certain data representations;
these are known as Parse-O-Matic's "predefined data types":
DATA TYPE  BYTES  MINIMUM VALUE  MAXIMUM VALUE  COMMENTS
---------  -----  -------------  -------------  -----------
BYTE         1    0              255
INTEGER      2    -32768         32767
LONGINT      4    -2147483648    2147483647
REAL         6    -9999999999.9  9999999999.9   See NOTE #1
SHORTINT     1    -128           127
WORD         2    0              65535
DATE         -    -              -              See NOTE #2
TRIMMED      -    0 chars        255 chars      See NOTE #3
NOTE #1: The minimum and maximum values depend on the number of digits of
precision. See "The CalcReal Command" for details.
NOTE #2: The DATE type does not have a specific length. In some input
files, a date serial number might be represented by a numeric
format such as INTEGER or LONGINT. For more information, see
the discussions of the MAKETEXT, MAKEDATA and GETTEXT commands.
NOTE #3: The TRIMMED type does not have a specific length. You can use
it with MAKETEXT and GETTEXT commands to remove the spaces, tabs
and nulls on either side of a string. It can also be used with
MAKEDATA, but since this can produce a field of indeterminate
length, it is rarely useful in such a role.
Certain predefined data types can have a qualifier, which provides
additional information. All commands that use predefined data types will
accept the qualifier, but only the MAKETEXT command makes use of it.
DATA TYPE QUALIFIER DESCRIPTION EXAMPLES
--------- ------------------------ -------------------------------------
REAL Number of decimal places "REAL 2" -> 3.14 "REAL 4" -> 3.1415
DATE Date format "DATE ?y/?n/?d" -> "96/12/01"
Interpreting Data Formats in a File
-----------------------------------
When inspecting a hex dump of a binary file, bear in mind that on
PC-compatible computers, the bytes that comprise a number are often
reversed. For example, for the INTEGER and WORD data types, the eight most
significant bits of numeric values are usually placed AFTER the eight least
significant bits. Thus, the decimal value 5099 will appear as EB 13 in the
file, not 13 EB, despite the fact that decimal 5099 equals hex 13EB.
If you are dealing with data that treats numbers differently, you can
sometimes work around the problem by reversing the order of the bytes
before performing the conversion. For example, if the file contains a WORD
data type, but has the most significant byte FIRST, you can switch things
around, as demonstrated by this POM file:
CHOP 0 <-- Read the file manually
GET x "WORD" <-- Get two bytes from the file
APPEND y = x[2] x[1] <-- Flip the bytes around
MAKETEXT z y "WORD" <-- Convert the number
OUTEND |{z} <-- Output the result
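The byte ordering described in this section can be checked with a few
lines of Python (an illustration only, not Parse-O-Matic code). The helper
name hex_bytes is hypothetical. The sketch packs the value 5099 both ways,
then flips a most-significant-byte-first WORD before converting it, which
is the same idea as the POM file above.

```python
import struct

def hex_bytes(data):
    # Format bytes the way a hex dump shows them, e.g. "EB 13".
    return " ".join("%02X" % b for b in data)

value = 5099                          # decimal 5099 = hex 13EB
little = struct.pack("<H", value)     # least significant byte first
big = struct.pack(">H", value)        # most significant byte first

print(hex_bytes(little))              # EB 13
print(hex_bytes(big))                 # 13 EB

# A WORD stored most significant byte first can be flipped, then
# converted as an ordinary PC-style WORD.
flipped = big[::-1]
print(struct.unpack("<H", flipped)[0])   # 5099
```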
===========================================================================
TECHNIQUES
===========================================================================
--------------------------------------
Uninitialized and Persistent Variables
--------------------------------------
Even before a variable is assigned a value (using the SET command, for
example), you can use it in a POM command. An uninitialized variable has a
null value ("") and is treated normally by all commands.
EXCEPTION: To help you catch coding errors, the OUT and OUTEND commands
do not allow you to output an uninitialized variable. If you attempt to
do so, Parse-O-Matic issues a warning, and processing is terminated.
Variables are "persistent": once you have assigned a value to a variable,
it retains that value until it is changed. Even if you open a new input
file (see "POM and Wildcards") or a new output file (see "The OFile
Command"), all variables will retain their values; they will not be "reset"
back to null. (Of course, when Parse-O-Matic ends, all variables
disappear; they are not retained between separate runs of POM.)
Example
-------
Here is an example which illustrates why persistent variables are useful:
PAGELEN "55" <-- Set page length
SET partnum = $FLINE[ 1 10] <-- Extract the part number
SET descrip = $FLINE[12 60] <-- Extract the description
BEGIN lastpart <> partnum <-- Is this a new part number?
OUTPAGE <-- Generate a page eject
OUTHDG |PartNumber Description <-- Output a heading
OUTHDG |---------- ----------- <-- Output a heading
SET lastpart = partnum <-- Remember the current part number
END <-- End of BEGIN block
OUTEND |{partnum} {descrip} <-- Output the part number
The first time a line is read from the input file, the lastpart variable
will be null ("") because it has not yet been initialized. As a result,
the BEGIN block will be executed. (The OUTPAGE command will be ignored
in this first instance, since no data has been sent to the output file.)
The BEGIN block also sets the lastpart variable, which will retain that
value until it is changed.
When the second input line is read (and the POM code is run again from the
top), the BEGIN block will be run only if the current part number is
different from the previous one (which we saved in the lastpart variable).
If the part number is the same, the block is skipped and only the OUTEND
line is executed. If it is different, the BEGIN block will be run,
outputting the page eject and headings, and once again saving the partnum
in the lastpart variable, so we can check it during the third input line --
and so on.
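The same "remember the previous value" technique can be sketched in Python
(an illustration only, not Parse-O-Matic code). The function name report
and the sample lines are hypothetical, and the sketch is simplified: it
leaves out the page eject and treats the headings as ordinary output
lines.

```python
def report(lines):
    out = []
    lastpart = ""                   # keeps its value between input lines
    for fline in lines:
        partnum = fline[0:10]       # columns 1 to 10
        descrip = fline[11:60]      # columns 12 to 60
        if lastpart != partnum:     # is this a new part number?
            out.append("PartNumber Description")
            out.append("---------- -----------")
            lastpart = partnum      # remember the current part number
        out.append(partnum + " " + descrip)
    return out

sample = ["AAAAAAAAAA Widget", "AAAAAAAAAA Gadget", "BBBBBBBBBB Sprocket"]
for line in report(sample):
    print(line)
```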
------------------------------------
Inline Incrementing and Decrementing
------------------------------------
You can add "1" to a variable in a command. For example:
SET x = "3"
SET x = x+
After the second statement, x would have the value "4". Here are some
additional examples:
- Incrementing "1" gives you "2"
- Incrementing "9" gives you "10"
- Incrementing "99" gives you "100"
The first time a variable is referenced, it has a null value (unless you
SET it yourself). If you increment a null variable, it will be changed
from "" (i.e. null) to "1".
You can also subtract "1" from a variable in a command:
SET x = "3"
SET x = x-
After the second statement, x would have the value "2". Here are some
additional examples:
- Decrementing "0" gives you "-1"
- Decrementing "1" gives you "0"
- Decrementing "99" gives you "98"
When you do an inline increment or decrement, the variable itself is not
changed. (C programmers take note!) For example:
SET y = "3"
SET x = y-
After the second line, the x variable will equal "2", while the y variable
will still equal "3".
You can use inline incrementing or decrementing in conjunction with
substrings:
SET y = "X23X"
SET x = y[2 3]+
After the second line, the x variable will equal "24", while the y variable
will still equal "X23X".
Only integer numeric values can be incremented or decremented. If you
attempt to increment or decrement another type of variable (e.g. text or a
decimal number), Parse-O-Matic will halt, and report an error.
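The rules for inline incrementing and decrementing can be summarized in a
short Python sketch (an illustration only, not Parse-O-Matic code). The
function name inline_step is hypothetical. The manual states that
incrementing a null variable gives "1"; treating null as zero when
decrementing is an assumption made here.

```python
def inline_step(value, delta):
    """Return value plus delta as text; the original is not changed."""
    if value == "":
        number = 0             # null counts as zero (see assumption above)
    else:
        number = int(value)    # raises ValueError for text or decimals
    return str(number + delta)

y = "X23X"
x = inline_step(y[1:3], +1)    # like SET x = y[2 3]+
print(x, y)                    # y keeps its original value
```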
-------------
Line Counters
-------------
If your input record is divided over several lines (due to its original
format or perhaps because you used the SPLIT or CHOP command), it is
helpful to set up a line counter. The following example extracts the first
six characters of the second line of input records that span three lines
(designated lines 0, 1 & 2):
IF LineCntr = "1" THEN MyField = $FLINE[1 6]
OUTEND LineCntr = "1" |{MyField}
IF LineCntr = "2" THEN LineCntr = "" ELSE LineCntr+
For an alternative to line counters, see "The ReadNext Command".
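To see how the counter cycles, here is the same logic sketched in Python
(an illustration only, not Parse-O-Matic code). The function name
second_lines and the sample data are hypothetical. As in the POM example,
the counter is null on line 0, "1" on line 1 and "2" on line 2 of each
record.

```python
def second_lines(lines):
    """Collect the first six characters of line 1 of each 3-line record."""
    results = []
    line_cntr = ""                      # uninitialized variables are null
    for fline in lines:
        if line_cntr == "1":
            results.append(fline[0:6])  # columns 1 to 6
        # Reset after the third line, otherwise add one to the counter.
        if line_cntr == "2":
            line_cntr = ""
        else:
            line_cntr = str(int(line_cntr or "0") + 1)
    return results

records = ["rec1-a", "ABCDEFGH", "rec1-c", "rec2-a", "IJKLMNOP", "rec2-c"]
print(second_lines(records))
```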
-------
Tracing
-------
By setting the DOS variable POM to ALL, you can generate a trace file,
named POM.TRC. This is helpful if you have trouble understanding why your
file isn't being parsed properly. But be sure to test it with a SMALL
input file; the trace is quite detailed, and it can easily generate a huge
output file.
To save space, you can specify a particular list of variables to be traced,
rather than tracing everything. For example, to trace only the variable
PRICE, enter this DOS command:
SET POM=PRICE
To trace several variables, separate the variable names by slashes, as in
this example:
SET POM=PRICE/BONUS/NAME
This traces the three variables PRICE, BONUS and NAME.
-------
Logging
-------
Every time Parse-O-Matic runs, it creates a "processing log". This is a
text file named POMLOG.TXT, which is placed in Parse-O-Matic's home
directory. (For example, if POM.EXE is located in C:\POM, the file will
be C:\POM\POMLOG.TXT even if you run POM from another directory.) If the
file POMLOG.TXT already exists, it is renamed to POMLOG.BAK.
The processing log file POMLOG.TXT contains a report of what happened
during the last run of Parse-O-Matic. Usually, the file will be quite
short and look something like this:
COMMAND: POM TEST.POM TEST.TXT TEMP.TXT
DATE: JAN 01 1996
17:50:10 TEST.TXT opened for processing
17:50:14 TEST.TXT processing completed
The first line gives the DOS command line, while the second gives the
date. Subsequent lines give the time (Hours:Minutes:Seconds) and a
progress or error message.
If you encounter an error during processing, the text of the warning
message is saved in the processing log. It might look something like this:
COMMAND: POM TEST.POM TEST.TXT TEMP.TXT
DATE: JAN 01 1996
17:50:10 TEST.TXT opened for processing
17:50:10 Execution error in line number 3 of POM file TEST.POM
17:50:11 Required parameter is missing in OUT
If you process multiple input files, POMLOG.TXT might look something
like this:
COMMAND: POM EXAMPL15.POM DATA*.TXT TEMP.TXT
DATE: JAN 01 1996
14:21:27 DATA01.TXT opened for processing
14:21:28 DATA01.TXT processing completed
14:21:28 DATA02.TXT opened for processing
14:21:28 DATA02.TXT processing completed
14:21:28 DATA03.TXT opened for processing
14:21:28 DATA03.TXT processing completed
If for some reason the processing log can not be created, Parse-O-Matic
will continue to run; it will not terminate. For some additional comments
on logging, see "Unattended Operation".
----------
Quiet Mode
----------
Sometimes you don't want the user to see the Parse-O-Matic processing
screen. In such cases, you can use the "Quiet Mode" switch (/Q) on the
command line. For example:
POM XYZ.POM MYFILE.TXT TEMP.TXT /Q
The /Q switch suppresses the display of the processing screen. The only
time a user will see anything is if there is a problem (for example: the
input file was not found). In such a case, Parse-O-Matic will make a noise
via the PC speaker, then display a message (see "Unattended Operation" and
"The MSGWAIT Command" for some background information).
-------------------
The ShowNum Utility
-------------------
The ShowNum program (SHOWNUM.EXE in the standard Parse-O-Matic package) is
a small utility which converts a hex number to decimal and vice-versa.
This is helpful if you are using a hexadecimal file dump to locate specific
numeric data in a binary file.
To find out what the decimal number 123 is in hexadecimal, enter the
following command at the DOS prompt:
SHOWNUM 123
This will display:
123 = $7B
To find out what hex 400F is in decimal, enter the following command at
the DOS prompt:
SHOWNUM $400F
The $ character tells ShowNum that the number is in hexadecimal. The
program will display:
$400F = 16399
ShowNum can handle numbers between -2,147,483,648 (hex $80000000) and
2,147,483,647 (hex $7FFFFFFF).
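If ShowNum is not at hand, the same conversions are easy to reproduce.
The following Python sketch is an illustration only; the function name
shownum is hypothetical. It copies the output format shown above and
assumes negative numbers are handled as 32-bit two's-complement values,
which matches the stated limits but is otherwise unconfirmed.

```python
def shownum(arg):
    """Convert decimal text to $hex, or $hex text to decimal."""
    if arg.startswith("$"):
        number = int(arg[1:], 16)
        # Assumption: hex values from $80000000 upward are negative.
        if number >= 0x80000000:
            number -= 1 << 32
        return "%s = %d" % (arg, number)
    # Assumption: negative decimals are shown as 32-bit hex.
    return "%s = $%X" % (arg, int(arg) & 0xFFFFFFFF)

print(shownum("123"))      # 123 = $7B
print(shownum("$400F"))    # $400F = 16399
```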
===========================================================================
FILE HANDLING
===========================================================================
-------------------------------------
How Parse-O-Matic Searches for a File
-------------------------------------
When Parse-O-Matic needs to read a file, it follows this procedure:
1) Parse-O-Matic tidies up the file name in the following ways:
- It removes spaces and tabs
- It converts the file name to uppercase
- As per DOS convention, slashes (/) are converted to backslashes (\)
- If this type of file has a default extension, and if the file name
does not have a period (i.e. dot) in the name, the extension is
added.
2) If the file name is fully qualified (i.e. it includes a drive, a path,
or both), Parse-O-Matic tries to open that file. If it can not, it
terminates with an error message.
3) If the file name is not fully qualified, Parse-O-Matic follows this
procedure:
- It first looks for the file in the current directory.
- It then looks in the directory where the Parse-O-Matic program
(POM.EXE) is located.
- It then searches the DOS PATH for the file. (For information
about the PATH command, refer to your DOS manual.)
- If none of these steps locate the file, Parse-O-Matic terminates
with an error message.
The following types of files are affected...
TYPE OF FILE DEFAULT EXTENSION REFER TO MANUAL SECTION
---------------------- ----------------- -----------------------
POM (Control) File .POM "The POM File"
Date Information File See NOTE #1 "The POMDATE.CFG File"
Lookup File See NOTE #2 "The LookFile Command"
Properization Exception See NOTE #2 "The Proper Command"
Map File .MPF "The MAPFILE Command"
NOTE #1: The Date Information File is always called POMDATE.CFG. You can
put your standard version in the Parse-O-Matic directory. If you
wish to override it, you should place the modified copy in your
current (logged) directory.
NOTE #2: This type of file does not have a default extension. However, we
recommend "TBL" (i.e. "Table") for Lookup files and "PEF" for
Properization Exception Files.
Parse-O-Matic does NOT search for input and output files. They must be
in the current directory, or must have a fully qualified path. If the
input file is missing an extension, it is assumed to be TXT. If the
output file is not specified in the POM command, it is assumed to be
POMOUT.TXT, in the current directory. (See "How Parse-O-Matic Opens an
Output File")
Since Parse-O-Matic searches for files, you can place frequently-used
Lookup and POM files in a directory in your DOS path.
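The search procedure described in this section can be sketched in Python
(an illustration only, not Parse-O-Matic code). The names tidy_name and
find_file are hypothetical, as are the sample directories. The "exists"
argument is a stand-in for a real disk check, so the sketch can be tried
without any DOS files. Treating any name that contains a colon or a
backslash as qualified is an assumption.

```python
def tidy_name(name, default_ext=None):
    # Remove spaces and tabs, convert to uppercase, use backslashes.
    name = name.replace(" ", "").replace("\t", "").upper()
    name = name.replace("/", "\\")
    # Add the default extension only if the name has no period.
    base = name.rsplit("\\", 1)[-1]
    if default_ext and "." not in base:
        name += "." + default_ext
    return name

def find_file(name, current_dir, pom_dir, path_dirs, exists,
              default_ext=None):
    """Return the first matching full name, or None if not found."""
    name = tidy_name(name, default_ext)
    if ":" in name or "\\" in name:
        # Qualified with a drive or path: try that one file only.
        return name if exists(name) else None
    # Current directory, then the program directory, then the PATH.
    for folder in [current_dir, pom_dir] + list(path_dirs):
        candidate = folder.rstrip("\\") + "\\" + name
        if exists(candidate):
            return candidate
    return None

# Hypothetical disk contents, for demonstration only.
files = {"C:\\POM\\TEST.POM", "D:\\TOOLS\\LOOK.TBL"}
exists = lambda full_name: full_name in files
print(find_file("test", "C:\\WORK", "C:\\POM", ["D:\\TOOLS"], exists, "POM"))
```

Where the real program terminates with an error message, the sketch
simply returns None.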
--------------------------------------
How Parse-O-Matic Opens an Output File
--------------------------------------
Parse-O-Matic opens an output file the first time one of the output
commands (e.g. OUT, OUTEND, OUTHDG) has something to send to the file.
When opening an output file, Parse-O-Matic follows this procedure:
1) Normally, the name of the output file is specified on the POM command
line or, if it is omitted there, in an OFILE command within the POM
file.
If no output file name was given using either method, the name is set
to POMOUT.TXT (in the current directory).
2) Parse-O-Matic tidies up the file name in the following ways:
- It removes spaces and tabs
- It converts the file name to uppercase
- As per DOS convention, slashes (/) are converted to backslashes (\)
- If the file name does not have an extension, and it does not end in
a period or a colon, the extension TXT is added. Thus:
C:\XYZ becomes C:\XYZ.TXT
C:\XYZ. stays the same
C:\XYZ.DAT stays the same
LPT2: stays the same (see "Sending Output to a Device")
3) The output file name is compared to the input file name. If they are
the same, Parse-O-Matic terminates with an error. You cannot send
output to the input file, nor can you read input from the output file.
4A) If the file name is preceded by a plus sign ("+"), Parse-O-Matic will
append output to the file. Here are some examples:
+C:\XYZ.TXT output will be appended to the file
+LPT1: this refers to a device, so the "+" is ignored
If the file to which you are appending does not already exist, it is
first created, as an empty file.
4B) If the file name is NOT preceded by a plus sign, the following
procedure takes place:
- If a file with the specified name already exists, it is renamed
with a .BAK extension (replacing any previous file with that name).
- The file is created, as an empty file
For example, if you run Parse-O-Matic as follows:
POM MYPOM.POM INPUT.TXT C:\XYZ.TXT
then if C:\XYZ.TXT already exists, it is renamed to C:\XYZ.BAK.
5) Output is directed to the output file until Parse-O-Matic ends or a new
output file name is specified by the OFILE command.
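Steps 2 to 4 of this procedure can be sketched in Python. This is a rough
model of the documented behaviour, not Parse-O-Matic's own code; the function
names are hypothetical. The first function works on DOS-style names as plain
text. The second uses whatever paths your operating system understands,
compares the names by their absolute paths (an assumption of ours), and does
not handle devices such as LPT1:.

```python
import os

def tidy_output_name(name):
    """Apply the documented clean-up rules to an output file name (step 2)."""
    name = name.replace(" ", "").replace("\t", "")   # remove spaces and tabs
    name = name.upper().replace("/", "\\")           # uppercase, DOS slashes
    last_part = name.split("\\")[-1]                 # text after last backslash
    if "." not in last_part and not name.endswith((".", ":")):
        name += ".TXT"                               # supply default extension
    return name

def open_output(name, input_name):
    """Open an output file following steps 3, 4A and 4B; return the file."""
    append = name.startswith("+")
    path = name[1:] if append else name
    if path.endswith(":"):
        raise ValueError("devices are not handled in this sketch")
    same = (os.path.normcase(os.path.abspath(path))
            == os.path.normcase(os.path.abspath(input_name)))
    if same:
        raise ValueError("output file is the same as the input file")
    if append:
        return open(path, "a")       # created empty if it does not exist
    if os.path.exists(path):
        backup = os.path.splitext(path)[0] + ".BAK"
        os.replace(path, backup)     # replaces any earlier backup
    return open(path, "w")
```

For example, tidy_output_name("c:/xyz") gives C:\XYZ.TXT, while LPT2: and
C:\XYZ. are returned unchanged.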
REMINDER: Parse-O-Matic does not open the output file until it is time to
send it some data from the output commands (OUT, OUTEND etc.). If no data
is sent to the output file, it will contain its original data (assuming it
already existed). If this is a problem, you can either delete the output
file before running Parse-O-Matic, or place the following commands in the
PROLOGUE:
ERASE "OUTPUT.TXT" <-- Delete the output file
OFILE "OUTPUT.TXT" <-- Specify the output file
If you do this, and no data is sent to the output file, the file will not
exist. You can check if POM failed by consulting the DOS ERRORLEVEL.
(See your DOS manual for an explanation of ERRORLEVEL.)
- If the ERRORLEVEL is 0 and the file does not exist, it means that POM
ran successfully, but no output was sent to the file.
- If the ERRORLEVEL is 1 or higher, and the file does not exist, it
means that POM failed, or you used the HALT command before any output
was sent to the file.
If you are calling Parse-O-Matic from a program (rather than a batch file),
you can check the error level using the facilities built in to the language
in which the program was written. For example, Turbo Pascal lets you
run another program with the EXEC command, after which you can extract the
ERRORLEVEL from the low byte of the DosExitCode function result.
---------------------------
Appending to an Output File
---------------------------
If you want to add data to the end of the output file, you have three
alternatives:
1) Use wildcards, as explained in "POM and Wildcards". In such case,
the output file is empty when the first output line is generated
(although see method #2 for an exception). When processing with
wildcards, all output is sent to the same file, unless you change
the file with the OFILE command (see "The OFile Command").
2) Prefix the output file name with a plus sign. This tells Parse-O-
Matic that you want to add data to the end of the file, rather than
starting with an empty file. You can use this method on the command
line:
POM MYPOM.POM INPUT.TXT +C:\MYFILES\OUTPUT.TXT
You can also use this method in the OFILE command:
OFILE "+C:\MYFILES\OUTPUT.TXT"
In these examples, we provided the full path name to the output file.
If you do not specify a path name (e.g. OFILE "+OUTPUT.TXT"), the
output file is placed in the current directory.
3) Use a batch file and the DOS COPY command to control the concatenation
of output files. This method is less convenient, but it allows you to
bypass the addition of the new output if there is a processing error.
Here is a sample batch file (comments appear after the arrows):
@ECHO OFF <-- Turn batch echoing off
IF EXIST OUTPUT.TXT DEL OUTPUT.TXT <-- Get rid of old output file
POM MYPOM.POM INPUT.TXT OUTPUT.TXT <-- Parse the input file
IF ERRORLEVEL 1 GOTO QUIT <-- Quit if there was an error
IF NOT EXIST OUTPUT.TXT GOTO QUIT <-- Quit if no output generated
IF EXIST SAFETY.TXT DEL SAFETY.TXT <-- Get rid of old safety file
RENAME MAINFILE.TXT SAFETY.TXT <-- Backup the original file
COPY SAFETY.TXT+OUTPUT.TXT MAINFILE.TXT <-- Add the new output
:QUIT <-- Batch file label for GOTO
This method has the added advantage of creating a backup copy of the
original output file. If the data in the file is particularly
important, you could place the file SAFETY.TXT on another hard drive.
--------------------------
Sending Output to a Device
--------------------------
Parse-O-Matic recognizes that an "output file" is actually a device if it
has a colon (":") at the end of the name. You can direct Parse-O-Matic's
output to a standard device such as COM1: or LPT2: by specifying the device
name accordingly. For example:
POM XYZ.POM INPUT.TXT LPT1:
This directs the output to the LPT1 printer.
Parse-O-Matic can detect a "Not Ready" condition in most cases. A printer
is Not Ready when it is offline, out of paper, or its print buffer is full.
If a Not Ready condition occurs, the following happens:
- If you are running in Quiet Mode (/Q on the POM command line), a Not
Ready condition terminates Parse-O-Matic with a DOS ERRORLEVEL of 243.
- If you are not running in Quiet Mode, a message box gives you the option
of trying again, or cancelling processing. If you cancel, Parse-O-Matic
terminates with a DOS ERRORLEVEL of 244.
COM Ports
---------
If you are sending output to a COM port (e.g. COM1:) you should first set
the baud rate with the DOS MODE command, or Pinnacle Software's MODEM
program (available on our BBS and Web site).
The MODEM program is particularly useful if your COM port is driving a
modem. Parse-O-Matic talks to the operating system's COM device driver
rather than the modem itself, so before you send data to a modem, it is a
good idea to use the MODEM program to check that the modem is online and
functioning properly.
If you are using a high-speed modem (9600 bps or higher) and you find that
you sometimes lose some characters, the operating system or the modem may
not be handling a "Not Ready" condition properly during handshaking. In
such case, you may find it necessary to turn off buffering (locked DTE
speed) and run at a maximum speed of 9600 bps. For a quick course in
high-speed modems and buffering, see the Trouble-Shooting Guide included
with Pinnacle Software's Sapphire Bulletin Board System (also available on
our BBS and Web site).
For an example of sending output to a COM port, see "The Pause Command".
---------
DBF Files
---------
If Parse-O-Matic notices that the input file is a "DBase" file (i.e. it has
a DBF extension -- for example: MYFILE.DBF), it will change the way it
processes the data. For instance, the variable $FLINE is not defined.
Rather, each of the fields in the database is pre-parsed. Thus, if you
have a DBF file containing three fields (EMPNUM, NAME, PHONE), your entire
POM file might look like this:
IGNORE DELETED "Y"
OUTEND |{EMPNUM} {NAME} {PHONE}
The DELETED variable is created automatically for each record. If it is
set to "Y", it means the record has been deleted from the database and is
probably not valid. In most cases, you will want to ignore such records.
If you do not know what the field names are, you can obtain the list with
the following POM file:
TRACE DELETED
Afterwards, when you inspect the trace file (POM.TRC), you will see a
summary of all the fields. Since there are no output commands (e.g. OUTEND
and OUTHDG), the output file will be empty.
NOTE: Parse-O-Matic does not currently support DBF "Memo" fields.
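For readers curious about where a value like DELETED comes from: in the
standard dBASE III file layout, every record begins with a one-byte flag,
which is a space for a live record and an asterisk for a deleted one. The
Python sketch below reads that layout from a block of bytes. It is our own
illustration, not Parse-O-Matic's code; the function name read_dbf is
hypothetical, and Memo fields and later DBF variants are ignored.

```python
import struct

def read_dbf(data):
    """Yield one dict per record from the bytes of a simple dBASE III file."""
    # Header: record count at offset 4, header size at 8, record size at 10.
    record_count, header_size, record_size = struct.unpack_from("<IHH", data, 4)
    fields = []
    offset = 32                                  # first field descriptor
    while data[offset] != 0x0D:                  # 0x0D ends the descriptors
        name = data[offset:offset + 11].split(b"\x00")[0].decode("ascii")
        length = data[offset + 16]               # field width in bytes
        fields.append((name, length))
        offset += 32                             # descriptors are 32 bytes
    for index in range(record_count):
        start = header_size + index * record_size
        deleted = data[start:start + 1] == b"*"  # the deletion flag
        record = {"DELETED": "Y" if deleted else "N"}
        position = start + 1                     # skip the deletion flag
        for name, length in fields:
            raw = data[position:position + length]
            record[name] = raw.decode("ascii").strip()
            position += length
        yield record
```

Each dictionary holds one entry per field plus a DELETED entry, which
mirrors the variables a POM file sees for each DBF record.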
-----------------
POM and Wildcards
-----------------
You can process multiple input files with the same POM file by specifying a
DOS "wildcard" at the DOS command prompt. All output is then directed to
the same output file. For example:
POM XYZ.POM *.TXT OUTPUT.TXT
This runs the POM file XYZ.POM on each file in the current directory with a TXT
extension and sends all output to the file OUTPUT.TXT.
The POM file can determine which file it is reading by using the predefined
variable $COMMAND, which contains the current POM command line.
Consider the following scenario:
- You have installed POM.EXE in the directory C:\UTILITY
- The current directory contains ABC.POM, MARK.TXT, MARY.TXT and JOHN.TXT
- You enter the command POM ABC *.TXT OUT.TXT
Parse-O-Matic runs ABC.POM against the three TXT files. On the first input
file, $COMMAND will look like this:
C:\UTILITY\POM.EXE ABC.POM MARK.TXT OUT.TXT
On the next two input files, it looks like this:
C:\UTILITY\POM.EXE ABC.POM MARY.TXT OUT.TXT
C:\UTILITY\POM.EXE ABC.POM JOHN.TXT OUT.TXT
Note that the file OUT.TXT is NOT processed, even though it has a TXT
extension. POM will always avoid processing the output file.
Let's say you wanted to concatenate both MARK.TXT and MARY.TXT, and put the
file name at the top. You could do it with this POM file, named ABC.POM:
SET cmd = $COMMAND <-- Get the command line
BEGIN cmd <> lastcmd <-- Has it changed?
PARSE fname cmd "2* " "3* " <-- Extract the input file name
SETLEN flen fname <-- Get length of input file name
SET uline = "" <-- Initialize underline
PAD uline "L" "-" flen <-- Set underline
OUTEND lastcmd <> "" | <-- Output a linefeed unless
OUTEND lastcmd <> "" | <-- this is the first file
OUTEND |{fname} <-- Output the file name
OUTEND |{uline} <-- Output the underline
OUTEND | <-- Output a linefeed
SET lastcmd = $COMMAND <-- Remember this command line
END <-- End of code block
OUTEND |{$FLINE} <-- Output a line from the input
You could then process MARK.TXT and MARY.TXT with this command line:
POM ABC M*.TXT OUT.TXT
This processes any file starting with an "M" that has a TXT extension.
Another way to run the command is as follows:
POM ABC M???.TXT OUT.TXT
This processes any four-letter TXT file that starts with "M".
For more information about DOS wildcards, consult your DOS manual.
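If you would like to experiment with these ideas away from the DOS prompt,
the following Python sketch approximates them. It is an illustration under
assumptions, not a description of how Parse-O-Matic works internally: the
function names are hypothetical, Python's fnmatch rules stand in for DOS
wildcard rules (DOS may treat some patterns differently, for instance a
trailing ? may match no character at all), and the second function reflects
our reading of the PARSE line in ABC.POM.

```python
from fnmatch import fnmatchcase

def select_inputs(file_names, pattern, output_name):
    """Return the names matching `pattern`, leaving out the output file."""
    pattern = pattern.upper()
    output_name = output_name.upper()
    return [name for name in file_names
            if fnmatchcase(name.upper(), pattern)
            and name.upper() != output_name]

def input_name_from_command(command):
    """Return the text between the 2nd and 3rd spaces of a command line."""
    return command.split(" ")[2]
```

With the files from the scenario, the pattern *.TXT selects MARK.TXT,
MARY.TXT and JOHN.TXT but not OUT.TXT, while M*.TXT and M???.TXT both select
only MARK.TXT and MARY.TXT.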
-----------------------
Solving Memory Problems
-----------------------
Parse-O-Matic does all of its work in standard memory; it does not use
Extended or Expanded memory. This is rarely a problem, but if you do
somehow run out of memory, there are some steps you can take...
You can often free up some extra memory by unloading unused device drivers
and DOS TSR ("Terminate and Stay Resident") programs. (TSR's are sometimes
called "DOS Pop-Ups".)
Alternatively, most drivers and TSR's can be safely moved into high memory,
using the LOADHIGH function in your AUTOEXEC.BAT, or the DEVICEHIGH
function in CONFIG.SYS. Some older drivers and TSR's will not tolerate
this kind of relocation.
===========================================================================
OPERATIONAL PLANNING
===========================================================================
----------------------------
Effective Use of Batch Files
----------------------------
The built-in batch (BAT) capability of DOS and Windows is often overlooked,
even by seasoned computer professionals. You can use batch files to make
Parse-O-Matic easier to use. Batch files are created with a text editor
(such as DOS EDIT, or Windows Notepad).
Example #1: Save Yourself Some Typing
--------------------------------------
Here is a simple batch file (comments appear after the arrows):
@ECHO OFF <-- Turn off command-line echoing
POM MYPOM.POM INPUT.TXT OUTPUT.TXT <-- Run Parse-O-Matic
IF ERRORLEVEL 1 GOTO QUIT <-- Quit if an error occurred
SEE OUTPUT.TXT <-- View the output file
:QUIT <-- Batch file label
The advantage of this batch file is that it saves you the trouble of typing
in the entire POM command line each time you want to parse the file.
Example #2: Streamline Your Development
----------------------------------------
Here is a batch file which is useful during the development of a POM file.
@ECHO OFF
DEVELOP 50 MYPOM.POM IN.TXT C:\MYFILES\OUT.TXT
This batch file calls DEVELOP.BAT (included with Parse-O-Matic), which
displays a menu with the following options:
INPUT ------ View input file
EDIT ------- Edit POM file
PARSE ------ Run parsing job
OUTPUT ----- View output file
QUIT ------- Finished
This lets you try parsing, view the result, make changes to the POM file
if necessary, then parse again. You will find that this technique makes
development proceed quickly.
Here is an explanation of the second line of the batch file:
DEVELOP 50 MYPOM.POM IN.TXT C:\MYFILES\OUT.TXT
   :    :      :       :            :
   :    :      :       :            :
   :    :      :       :     Name of output file <-----
   :    :      :       :                              |
   :    :      :    Name of input file <-------- See NOTE #1
   :    :      :                                      |
   :    :  Name of POM file <--------------------------
   :    :
   :  Save position for menu <-------- See NOTE #2
   :
Invokes the batch file DEVELOP.BAT
NOTE #1: You must provide the full path to the files (unless they are in
the current directory) and the extension.
NOTE #2: The "save position" remembers where you were in the menu. You
may use values 50 to 99 to provide a "memory" for 50 different
batch files that call DEVELOP.BAT. (The other values are
reserved for the Parse-O-Matic installation and tutorial
procedures.) If 50 is not enough, you can place additional
batch files in another directory; the menu save file (POM.MSV)
is always placed in the current directory.
In order for DEVELOP.BAT to work correctly when you are in a directory
other than the Parse-O-Matic directory, you must place the Parse-O-Matic
directory in your DOS PATH (see your DOS manual for details). Your PATH
must also include the directory that contains your text editor. (In the
original Parse-O-Matic package, DEVELOP.BAT calls up DOS EDIT.)
You may find it instructive to study the file DEVELOP.BAT by loading it
into a text editor. The batch file contains some comments which explain
how it works. As mentioned in one of the comments, you may wish to change
the text editor that the batch file calls for editing the POM file.
You may also find the program MENU.EXE useful. For a brief description,
type MENU /? at the DOS prompt. To study a typical menu definition file,
enter the command SEE POM.MNU at the DOS prompt.
Example #3: Automatic Batch Files
----------------------------------
Let's say that each day you have a text file, named DELLIST.TXT, which
lists the names of the files that need to be deleted:
FRED.TXT
MARY.TXT
JOHN.TXT
HARRY.TXT
You could write a POM file (we'll call it MAKEDEL.POM) to write a batch
file to delete the files. It would look like this:
PROLOGUE
OUTEND |@ECHO OFF
END
IGNORE $FLINE = "COMMAND.COM" <-- An example of a safety feature!
OUTEND $FLINE <> "" |DEL {$FLINE}
You could automate the entire procedure with the following batch file
(which we'll call DAILYDEL.BAT):
@ECHO OFF <-- Turn off command-line echoing
POM MAKEDEL.POM DELLIST.TXT TEMP.BAT <-- Create the batch file TEMP.BAT
IF ERRORLEVEL 1 GOTO QUIT <-- Quit if an error occurred
TEMP.BAT <-- Run the batch file
DEL TEMP.BAT <-- Delete it
:QUIT <-- Batch file label
The second line of DAILYDEL.BAT runs Parse-O-Matic to create a batch file
named TEMP.BAT. Given the input file shown earlier, TEMP.BAT would look
like this:
@ECHO OFF
DEL FRED.TXT
DEL MARY.TXT
DEL JOHN.TXT
DEL HARRY.TXT
After TEMP.BAT is created, DAILYDEL.BAT runs TEMP.BAT (thus deleting all
the files listed in DELLIST.TXT).
This is only a simple example. Parse-O-Matic's ability to create batch
files based on input data provides you with a very powerful tool for
automating daily administrative tasks.
When you write automatic applications, you should be careful to include
routines in both the batch files and the POM files to handle any unusual
conditions. In MAKEDEL.POM, we checked the file to be sure that it wasn't
"COMMAND.COM", because if that file is deleted, your system will probably
stop working!
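The logic of MAKEDEL.POM is simple enough to model in a few lines of Python,
which can be handy when you want to reason about the safety checks before
trusting them with real files. This is a sketch only; the function name and
the "protected" parameter are hypothetical. Unlike the POM file, it ignores
upper/lower case and surrounding spaces when comparing names.

```python
def make_delete_batch(lines, protected=("COMMAND.COM",)):
    """Build the lines of a batch file that deletes each listed file."""
    batch = ["@ECHO OFF"]
    for line in lines:
        name = line.strip()
        if not name or name.upper() in protected:
            continue                 # skip blank lines and protected files
        batch.append("DEL " + name)
    return batch
```

Given the four names in DELLIST.TXT, the function returns the same five
lines shown for TEMP.BAT. Adding more names to the protected list is one way
to extend the safety feature.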
Example #4: Controlling a POM File from the Command Line
---------------------------------------------------------
Consider the following batch file, which we will call SELECT.BAT:
@ECHO OFF <-- Turn off command-line echoing
IF (%1) == () GOTO ERROR <-- Make sure we have a parameter
SET XYZ=%1 <-- Set the environment variable XYZ
POM SELECT.POM INPUT.TXT OUT.TXT <-- Run the POM file SELECT.POM
GOTO QUIT <-- Jump to the QUIT label
:ERROR <-+
ECHO Missing parameter | Error-handling routine
PAUSE <-+
:QUIT <-- Batch file label
SET XYZ= <-- Get rid of the environment variable
SELECT.BAT can be used with this POM file, which we will name SELECT.POM:
PROLOGUE
GETENV xyz "XYZ"
END
OUTEND $FLINE ^ xyz |{$FLINE}
You can use SELECT.BAT to output only those lines that contain the text
that you specify. For example, you can enter the following command at the
DOS prompt:
SELECT MARY
This will output only those lines (from INPUT.TXT) that contain "MARY".
If you wish to ignore distinctions between uppercase and lowercase, change
the last line of SELECT.POM accordingly:
OUTEND $FLUPC ^ xyz |{$FLINE}
Batch file parameters are separated by spaces on the command line, so the
following command would not work as you might expect:
SELECT MARY FRED JOHN
This would set the batch variable %1 to MARY, %2 to FRED and %3 to JOHN.
One way to deal with this is to eliminate the spaces when you run the
batch file:
SELECT MARY/FRED/JOHN
You can then replace the OUTEND command in SELECT.POM with these lines:
APPEND x xyz "/" <-- Set the x variable to "MARY/FRED/JOHN/"
BEGIN x <> "" <-- We will loop through all of the names
PEEL y x "" "/" <-- Move a name to the y variable
OUTEND $FLUPC ^ y |{$FLINE} <-- Output a line if it contains the name
AGAIN <-- Go back to the BEGIN
Bear in mind that the system environment space is limited. If you have
problems with an application like this one, refer to "The GETENV Command",
in the section entitled "Disappearing Environment Variables".
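The APPEND/PEEL loop shown earlier can be modelled in Python as follows. This
is a sketch of the idea, not of Parse-O-Matic itself. It assumes that PEEL
removes each name together with its trailing slash, and the function name is
hypothetical. Note one difference: the sketch stops at the first name it
finds, whereas the loop in the POM listing, as written, tests every name.

```python
def line_matches(line, names):
    """Return True if `line` contains any of the slash-separated `names`."""
    remaining = names + "/"                        # APPEND x xyz "/"
    upper_line = line.upper()                      # what $FLUPC provides
    while remaining != "":                         # BEGIN x <> ""
        name, remaining = remaining.split("/", 1)  # PEEL y x "" "/"
        if name and name in upper_line:
            return True
    return False
```

As with the POM version, the names must be supplied in uppercase, because
they are compared with the uppercase form of each input line.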
------------------------------------------
Running Parse-O-Matic from Another Program
------------------------------------------
If you are calling Parse-O-Matic from a program written in a high-level
language (such as Pascal, Delphi, C or Basic), you can check its success or
failure by consulting the "DOS Error Level". Most languages have built-in
facilities to test this value.
For example, Turbo Pascal lets you run another program with the EXEC
command, after which you can extract the ERRORLEVEL from the low byte of
the DosExitCode function result. You can also inspect the DosError variable
to detect invocation errors. Some typical errors include:
Program not found, Path not found, Access denied, Not enough memory.
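The same two checks (did the program start, and what exit code did it
return) exist in most languages. Here is a minimal sketch in Python, offered
as an illustration rather than a recommended way to call Parse-O-Matic. The
function name run_and_check is hypothetical.

```python
import subprocess

def run_and_check(command):
    """Run `command` (a list of arguments); return (started, exit_code)."""
    try:
        completed = subprocess.run(command)
    except OSError:
        return (False, None)     # could not start, e.g. program not found
    return (True, completed.returncode)
```

A call might look like run_and_check(["POM", "MYPOM.POM", "INPUT.TXT",
"OUTPUT.TXT"]), assuming POM can be found through your PATH. A result of
(True, 0) would mean that the program ran and reported no error.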
Windows developers should avoid running Parse-O-Matic in a minimized window
because if an error occurs, the user will not see the message. In such
case, Parse-O-Matic can terminate after a suitable delay (see "The MSGWAIT
Command"), but the mysterious pause might cause the user some concern.
On long parsing jobs (taking 3 seconds or more on your slowest machine),
it is perhaps best to let the user see the processing screen rather than
running in Quiet Mode (see "Quiet Mode"). If nothing else, it gives him
or her something to look at, and provides assurance that the machine has
not locked up.
--------------------
Unattended Operation
--------------------
You can design applications that run themselves while you are not there.
There are two reasons why you might want to do this:
- You can run long processing jobs just before leaving work at night
- Parse-O-Matic is useful, but it isn't very interesting to watch!
Several features of Parse-O-Matic facilitate "unattended operation".
- The SOUND command can alert you if something unusual happens; you don't
have to stare at the screen to make sure that everything is working.
- All error messages (which say "Press a key to continue") make a noise
via the PC speaker (see "The Sound Command").
- You can use the MSGWAIT command to let processing continue if there is
an error (see "The MSGWAIT Command").
- The processing log (see "Logging") can be used to check processing.
Let's say you wanted to concatenate (add together) several enormous text
files. You could start with the following POM file (named ADD.POM):
SET cmd = $COMMAND
BEGIN cmd <> lastcmd
SOUND "BEEP"
SET lastcmd = cmd
END
OUTEND |{$FLINE}
You could then enter the command POM ADD.POM *.TXT ALL.TXT and walk away.
Whenever a new file is started, you'll hear a beep. When you come back,
you can check the file POMLOG.TXT (which will be located in the same
directory as POM.EXE). It might look something like this:
COMMAND: POM ADD.POM *.TXT ALL.TXT
DATE: JAN 01 1996
16:39:12 JOHN.TXT opened for processing
16:45:28 JOHN.TXT processing completed
16:45:29 MARY.TXT opened for processing
16:52:10 MARY.TXT processing completed
16:52:11 FRED.TXT opened for processing
17:03:33 FRED.TXT processing completed
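If you review logs like this one regularly, a short script can work out how
long each file took. The Python sketch below assumes the log looks exactly
like the sample above; the layout of a real log may differ. The function
name is hypothetical, and jobs that run past midnight are not handled.

```python
from datetime import datetime

def processing_times(log_lines):
    """Return {file name: seconds} from 'opened'/'completed' line pairs."""
    opened = {}
    durations = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) < 3 or parts[0].count(":") != 2:
            continue                           # not a timed entry
        stamp = datetime.strptime(parts[0], "%H:%M:%S")
        name, event = parts[1], " ".join(parts[2:])
        if event == "opened for processing":
            opened[name] = stamp
        elif event == "processing completed" and name in opened:
            elapsed = stamp - opened[name]
            durations[name] = int(elapsed.total_seconds())
    return durations
```

The COMMAND: and DATE: lines are skipped because they do not begin with a
time of day.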
If you are processing multiple files, and each one uses a different POM
file (and hence requires a separate run of Parse-O-Matic) you can write
your batch file so that it renames the log files. This lets you review
each log file later. For example:
@ECHO OFF
POM JOHN.POM JOHN.TXT JOHN.LST
RENAME C:\POM\POMLOG.TXT JOHN.LOG
POM MARY.POM MARY.TXT MARY.LST
RENAME C:\POM\POMLOG.TXT MARY.LOG
POM FRED.POM FRED.TXT FRED.LST
RENAME C:\POM\POMLOG.TXT FRED.LOG
When processing is complete, the files JOHN.LOG, MARY.LOG and FRED.LOG
will be available in the directory C:\POM for your inspection.
Here is a slightly more sophisticated version of the batch file:
@ECHO OFF
POM JOHN.POM JOHN.TXT JOHN.LST
IF ERRORLEVEL 1 GOTO QUIT
RENAME C:\POM\POMLOG.TXT JOHN.LOG
POM MARY.POM MARY.TXT MARY.LST
IF ERRORLEVEL 1 GOTO QUIT
RENAME C:\POM\POMLOG.TXT MARY.LOG
POM FRED.POM FRED.TXT FRED.LST
IF ERRORLEVEL 1 GOTO QUIT
RENAME C:\POM\POMLOG.TXT FRED.LOG
:QUIT
The IF ERRORLEVEL lines jump to the end of the batch file if Parse-O-Matic
generates an error of 1 or higher. When coding batch files, remember that
the IF ERRORLEVEL command is considered "True" if the error is the
specified value or higher. This means you should always test the higher
value first. See your DOS manual for details.
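The "specified value or higher" rule is easy to get wrong, so here is a
small Python model of it. It is an illustration only, and the function names
are hypothetical. The values 243 and 244 are the Not Ready codes mentioned
in "Sending Output to a Device".

```python
def if_errorlevel(exit_code, threshold):
    """Mimic the batch test IF ERRORLEVEL n: true when exit_code >= n."""
    return exit_code >= threshold

def first_match(exit_code, thresholds):
    """Return the first threshold for which IF ERRORLEVEL would be true."""
    for threshold in thresholds:
        if if_errorlevel(exit_code, threshold):
            return threshold
    return None
```

Testing in the order 244, 243, 1 tells the two Not Ready codes apart.
Testing in the order 1, 243, 244 does not, because any nonzero code already
satisfies the first test.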
--------
Examples
--------
Many of the techniques described in this manual are demonstrated by the
examples provided with the standard Parse-O-Matic package. To see these
examples, switch to your Parse-O-Matic directory, type START at the DOS
prompt, or run START.BAT from Windows or OS/2, then select TUTORIAL.
===========================================================================
RUNNING UNDER WINDOWS
===========================================================================
-------------
Compatibility
-------------
Parse-O-Matic is a DOS program, which has a few advantages and a few minor
disadvantages for Windows users.
The primary advantage is that a Parse-O-Matic application can run on any
PC-compatible machine, whether it is running DOS, Windows, or OS/2.
Emulators are also available which will let you run Parse-O-Matic (and
other DOS software) on Macintosh computers.
Since Parse-O-Matic has no user interface to speak of, Windows' wonderful
graphical environment is not particularly important. The only operational
difference is that to interrupt Parse-O-Matic processing, you press the Esc
key instead of clicking on a Cancel button.
Performance is a consideration if you are running Parse-O-Matic at the same
time as 32-bit applications under Windows 95 or NT; it will slow them down
slightly. However, unless you are multi-tasking heavily, performance is
not an issue because the usual bottleneck is the responsiveness and
transfer speed of the hard disk, not the speed at which the Parse-O-Matic
program runs.
-------------------------
Setting Up for Windows 95
-------------------------
To use Parse-O-Matic under Windows 95, you need the following items, which
are included in the standard Parse-O-Matic package:
1) The POM file (icon file POM_FILE.ICO)
2) A batch file (icon file BAT.ICO)
You may find it helpful to copy these two icon files to your main Windows
directory so that the associations you set for them are not lost if you
install a new version of Parse-O-Matic and then delete the original POM
directory.
Setting Up an Association for the POM File
------------------------------------------
When you click on a POM file, it should call up a text editor. To configure
this, follow these steps:
1) Double-click on "My Computer"
2) From the pull-down menu, select View/Options
3) Dialog Box: Options
Click on the File Types tab
Click on the New Type button
4) Dialog Box: Add New File Type
Description: Parse-O-Matic Control File
Associated extension: POM
Click on the New button
5) Dialog Box: New Action
Action: &Edit
Application used: NOTEPAD.EXE (or the path to your favourite editor)
Click on the OK button
6) Dialog Box: Add New File Type
Click on the Change Icon button
Click on the Browse button
File name: The full path to POM_FILE.ICO (e.g. C:\POM\POM_FILE.ICO)
Press Enter
7) Dialog Box: Change Icon
Click the OK button
8) Dialog Box: Add New File Type
Click the Close button
9) Dialog Box: Options
Click the Close button
Once you have followed these steps, you can double-click on the POM file
icon when you are in Windows Explorer or a file folder, and it will be
opened with the file editor you specified in step 5.
Setting Up an Association for the BAT File (Optional)
-----------------------------------------------------
Windows 95 is already set up to process batch (BAT) files. However,
Parse-O-Matic comes with an alternative icon which is more distinctive than
the one supplied with Windows. (The Parse-O-Matic icon looks like a bat --
a sonar-equipped flying critter with the undeserved bad reputation).
To change the icon, follow these steps:
1) Double-click on "My Computer"
2) From the pull-down menu, select View/Options
3) Dialog Box: Options
On the list box, find and double-click on MS-DOS Batch File
4) Dialog Box: Edit File Type
Click on the Change Icon button
5) Dialog Box: Change Icon
Click the Browse button
File name: The full path to BAT.ICO (e.g. C:\POM\BAT.ICO)
Press Enter
6) Dialog Box: Change Icon
Click the OK button
7) Dialog Box: Edit File Type
Click the Close button
8) Dialog Box: Options
Click the Close button
After following this procedure, your batch (BAT) file icons will be much
more noticeable when they appear in Windows Explorer or a file folder. To
edit the batch file, right-click on the icon and select Edit. To run the
batch file, simply double-click the icon.
For a discussion of batch files, see "Effective Use of Batch Files".
------------------------------
Installing the ShowNum Utility
------------------------------
The ShowNum program is a small utility which converts a hex number to
decimal and vice-versa (see "The ShowNum Utility").
To install ShowNum as a Windows 95 shortcut:
1) Select "File/New/Shortcut" from the pull-down menu of any folder.
2) Specify the path name to SHOWNUM.BAT, followed by a question mark.
For example: C:\POM\SHOWNUM.BAT ?
The ? means "prompt for input before calling the batch file".
3) After you have finished defining the shortcut, right-click on the icon,
select "Properties", then the "Program" tab, and make sure the "Close
on exit" box is checked off.
You can then use ShowNum by double-clicking its icon. You will be prompted
to enter a number, and the answer will be displayed.
------------------------
Long File Names in Win95
------------------------
Although Parse-O-Matic can be run under Windows 95, it will only recognize
standard DOS file names; it does not use the long file names supported by
Win95. You can determine the underlying DOS name of a file by checking its
"Properties" in Windows Explorer, or by using the DIR command while in DOS
mode.
===========================================================================
LICENSING
===========================================================================
This product is available in several forms. For billing and pricing
information, view or print the files OPTIONS.DOC and ORDER.FRM.
TRIAL COPY: If you have a "test-drive" evaluation copy, you will see a
"Registration Reminder Screen" when you start up the program. You are
entitled to evaluate this program at no cost for 3 months. If you continue
to use it after that, you must register your copy and purchase a license,
as described below.
SINGLE-USER LICENSE: When you register an evaluation copy of this product,
you will receive the latest version, plus an unlocking code that will let
you register any new evaluation versions that we release for a period of
two years (six years for deluxe registration).
SITE/MULTI-COPY LICENSES: If you plan to run 15 or more copies of this
program (on a network or on separate computers), you can obtain quantity
pricing. For details, view or print the text file ORDER.FRM.
LAN LICENSE: Local Area Network users must purchase a license for each
user (see "Single-User License" and "Site/Multi-Copy Licenses"), although
they can reduce this amount if they have run-control software which sets an
upper limit on the number of concurrent users for a given program.
WAN LICENSE: Wide Area Networks are treated like LANs, but you may find it
more economical to purchase a Distribution License (see below).
DISTRIBUTION LICENSE: The distribution license allows you to use an
unlimited number of copies. You may include it in your application or
commercial package as a utility. The only restriction is that you may not
distribute this document (i.e. the user manual) or its essential content.
With this safeguard, we avoid placing ourselves in competition with you;
the program must be used to support an application or product rather than
being its main feature.
SOURCE CODE LICENSE: If you purchase the Turbo Pascal source code, you
must also purchase a license for each machine that will run the modified
program. Those portions of the source code written by Pinnacle Software
remain copyrighted by Pinnacle, and may not be divulged to another party.
As an alternative to purchasing the source code, you can also contract for
us to make custom modifications to the program.
RETAIL LICENSE: You can sell complete, registered copies of this product,
complete with documentation, in return for royalties. The terms depend on
volume and advance payments. Contact us for details.